ABSTRACT


An enumeration algorithm which synthesizes programs from
example computations is presented. The algorithm, originally
proposed by Alan W. Biermann of Duke University, assigns a
labelling of the instructions contained in an example trace
consistent with producing minimum state Moore machine
representations for the synthesized programs. Techniques for
processing the information to reduce enumeration are given.
Biermann's algorithm is extended by trace preprocessing
techniques which identify and generalize conditions on
instruction sequencing in the synthesized programs without
the user's assistance. The techniques are presented using
text editing as the domain, but are general enough to be
extendable into other domains.

I. INTRODUCTION


A. BACKGROUND

Since the introduction of electronic computing machines,
manual tasks that are mundane, tedious and/or repetitious
have been considered for automation. The computer is ideally
suited for this type of work since it neither complains of
boredom nor wanders from its assigned task. The machine
meticulously sequences through a series of computations over
and over, producing answers consistent within the
limitations of the hardware. As consistent as the computer
is at performing tasks, assigning the tasks is still left to
the user of the system.

Programming the early machines was a difficult chore.
Communication between man and machine was possible only
through the language of the machine. This
machine language consisted of binary coded machine
operations. The efficient machine language programmer had to
memorize these codes or keep a list of the codes close by.
All control transfer points had to be coded in absolute
machine addresses which the programmer calculated by hand. A
programmer had to interpret the binary representation of
the machine operations to determine the cause of errors in
programs. There were no diagnostic messages to aid the user
in isolating errors. The difficulty of programming in
machine language led to a search to find better ways of
generating programs. The first step was the recognition that
the computer was a good bookkeeper, capable of computing
absolute addresses from labels and translating mnemonic
representations of machine operation codes. Webster's New
World Dictionary, Second Edition, defines mnemonic to be, "a
system or technique of improving memory by the use of
certain formulas." Soon programs were written which would
accept abstract programs containing mnemonics and labels,
convert the mnemonics into machine operation codes and
translate the labels into absolute machine addresses. These
programs produced executable machine language code as
output. These translation programs were called assemblers
and the data they translated were called assembly language
programs.

Assembly language provided some automation of the manual
tasks associated with machine language programming. An
important convenience of assembly language is the
readability of the programs when compared to machine
language programs. The mnemonics convey the meaning of their
function while the labels relieve the programmer of
calculating absolute addresses for control transfer points.
Assembly language provided a level of abstraction which
allowed programmers to concentrate on the programming
problem without dealing with every atomic machine operation.
The assembler performed bookkeeping, address translation and
mnemonic decoding quickly and efficiently. Programmers were
now capable of producing more code in less time with fewer
errors.

Assembly language eased the programmer's task but it
still could not be considered a panacea for computer-human
interaction. Assembly language still required the programmer
to maintain control over many machine operations and he had
to provide the logic to control the flow of program
execution. The instructions used to perform control
functions appear as similar code fragments in most programs
written in assembly language. These code fragments performed
functions such as controlling branching decisions and keeping
count of loop indices. When it was observed that common code
fragments appeared across a wide range of assembly programs,
it was recognized that these code fragments could be
represented as a single instruction and the computer could
translate the single instruction into the code fragment it
represented. The programs that translate these complex
instructions are called compilers or interpreters. The
compiled or interpreted languages that followed assembly
language in this evolutionary process incorporated the
program fragments as single instructions of the language.
Constructs such as FOR, DO WHILE and IF THEN are examples of
higher level control structure implementation.

FORTRAN was the first in a long line of higher level
languages. FORTRAN differed from the others by becoming
endeared to a family of users and the language endures today
as one of the most frequently used higher level languages.
What qualities of the language produced this popularity?

The FORTRAN language is attributed to John Backus. His
primary goal when designing the language was to make the
language resemble the notation used in high school algebra.
Since the notation used in high school algebra was familiar
to a wide audience, FORTRAN gave a friendly appearance. The
language's apparent simplicity is the endearing quality of
FORTRAN. Some other language implementors failed to
recognize this point and their languages never received wide
acceptance. ALGOL is an example of a powerful language that
never received the acceptance anticipated.

Other programming languages that followed added compact
representation of other recurring program fragments. The
higher level constructs were not limited to control
structures but also included constructs for data
manipulation functions. Iverson's [1] APL (A Programming
Language) provided powerful operators capable of performing
complex functions such as matrix multiplication in one
instruction.

This trend continues today. Many of the newer languages
implement sophisticated and powerful operators and control
structures. Some of these languages are for a select segment
of computer users, intended for application to a particular
domain. The users are expected to be familiar with the
domain, so the form of the language should be familiar to
the user also. A problem with a domain specific language is
its inability to adapt to other areas. To work in another
area the user must become familiar with another language. A
phenomenon demonstrated by many computer users is a
reluctance to adapt themselves and learn a new language that
may be more appropriate for a given task. Either they crack
the egg with a sledge hammer or dig the well with a spoon.
When required to use a new language, the user will likely
use only a small subset of the language that is capable of
doing the job. Worse than using only a subset of the
language features is the tendency to bring old programming
styles applicable to the old language into the new language.
The point to be made is that learning a new
programming language is a hard chore and is avoided whenever
possible.

Another direction which the automation of programming
tasks has taken is the development of a programming
environment. A programming environment automates some of the
manual chores by providing the user with aids that assist
him in constructing programs. The environment includes a
programming language, an interactive syntax-directed editor
and an on-line debugger. The editor provides syntax error
diagnostics while the programmer is creating the source
file. The programmer is forced to correct the syntax error
immediately before the editor will allow him to continue
programming. The error should be readily apparent to the
programmer because it is in the latest input. The on-line
debugger allows the programmer to actively test his program,
halt execution, check the value of variables, change the
value of variables or change the code itself. Programming
environment systems may even allow the programmer to switch
from the editor to the on-line debugger and back at any
time. A programming environment can be summarized as a
friendly interface utilizing an intelligent editor which can
recognize syntax errors in the associated programming
language and one that contains other interactive programming
tools.

Programming has been called an art form requiring
intellectual creativity. The automation of intellectual
behavior is a field of study within Computer Science called
Artificial Intelligence. The study of the automation of
programming tasks which require human-like reasoning is
called Program Synthesis or Automatic Programming. It is not
our intention to provide a definition of intelligent
behavior for a machine since there is considerable
disagreement even among the experts. However, we note that
the goal of research in automatic programming is the same
goal that led to all the advances in programming languages.
Informally, this goal is to make the interaction between man
and computer as painless as possible. That is, painless for
the man but not necessarily for the computer. Dijkstra [2]
objects to our automation of programming by claiming, "We
should not automate programming even if we can, because it
would take away our enjoyment of the task." We note there
are those who may require the use of computer services that
have neither the time nor inclination to obtain the required
education to do that chore. These include professions such
as lawyers, physicians, and even theoretical physicists. We
assume, if programming becomes fully automated, the
programmers will then turn their attention toward other
creative and stimulating pursuits. R. Hamming has said, "The
purpose of computing is insight not numbers."

Many on-going efforts are aimed at providing better
systems for the user so he may create programs faster, with
fewer errors and with less effort. The history of programming
language development has shown that automation of many
programming tasks is feasible. How much more of the
programming tasks can be automated? What would be considered
the ultimate system for producing computer programs?

B. AUTOMATIC PROGRAMMING

1. General

Program synthesis or automatic programming is a
research topic concerned with the development of systems
that provide more and more automation of the programming
process, particularly those tasks requiring human-like
reasoning. The goal is not to create systems that program
themselves, but to create systems which can construct, under
the direction of a user, programs that can perform some
function he desires. These systems must be easy to use, easy
to learn, and increase the efficiency of the user. The users
of these systems will no longer be restricted to the few
computer professionals, but will include other professional
fields as well as non-professionals. Automatic programming
systems are to interact with the user, recognize
requirements, and then synthesize a correct program that
satisfies the requirements.

Two questions arise in the research on automatic
programming. First, what is the form of the interaction
between the user and the system? This question is called the
specification problem because it is concerned with issues
relating to how the user is to inform the system of his
requirements. The second question is, given a specification
method, what synthesis technique is available to be applied
that will transform the specification into an appropriate
program? The technique used for synthesis is often dependent
upon the form of the problem specification and most of the
projects involving automatic programming consider both
problems together. It has been proposed by Green [3] that
the two questions should be separated with research
proceeding concurrently on both problems. He proposes that
there be a standard intermediate representation of the problem
specification which would permit interaction between the two
problems.

Four techniques have been proposed for the
specification problem which dominate the literature on
automatic programming. Each of the proposed techniques of
problem specification introduces a different approach to the
synthesis problem. The four specification techniques can be
categorized as follows:

1. Natural Language.

2. Formal Problem Specification.

3. Input-output Pairs.

4. Example Computations.

Each of these specification techniques will be discussed in
the following subsections, along with its relationship to a
synthesis approach.

2. Problem Specification with Natural Language

A visionary approach to the specification problem is
the use of natural language. Natural language provides a
fast, comfortable method of communication which is already
understood by humans. Implementation of a natural language
understanding system has proven to be a very difficult
problem (Class [4]).

Two forms of natural language are the spoken form
and the written form. Understanding spoken language
increases the degree of difficulty because the communication
is in the form of audio waves. Once the audio input is
captured, it must be converted into another form for further
syntactic and semantic analysis. The reader will note that
once the audio input has been captured and converted the
problems of written and spoken language become the same.
That is, the internal representation of the spoken and
written word can be the same and the problem becomes one of
inferring meaning from the representation. Future advances
in voice understanding hardware can be expected and these
advances may be expected to find their way into use.

A complete natural language understanding system
would be expected to be able to understand all grammatically
correct sentences. However, natural languages do not have
finite grammars. This complexity implies a complete
understanding system cannot be implemented. However, a
system capable of understanding a subset of natural language
can prove useful in specific domains. Early examples of
programming through natural language dialogue are presented
in a survey by Heidorn [5]. Current work on understanding
natural language may be found in Biermann [6] and Walker
[7].

In conclusion, natural language understanding is a
difficult problem that can be solved only in limited
domains. The use of natural language in programming has been
shown to be possible by Heidorn [5] and by Biermann [6] in
limited domains. The systems developed up to today have been
experimental systems and the results will aid in
understanding the problem. Natural language programming
systems will not be available for industry for at least a
decade. Finally, we present the example Biermann [6]
describes as a natural language specification for a problem.
This example is quoted from his paper on natural language
programming. Its intent is to give a feel for programming in
natural language. This example does not specify the
algorithm that is to be used although a natural language
programming system would be capable of accepting such a
specification.

3. Formal Problem Specification

The second technique is formal specification of the
problem. As the name implies, the input is in a more rigid
structure than natural language. This technique allows the
user to convey the behavior he desires the synthesized
program to have without specifying the algorithm that is to
be used. Smith [8] gives the following definition for the
form of a formal specification of a problem A.

"A(x) = z such that z ∈ S & P(z,x) where x ∈ D &
I(x) where D and S are the input and output data
types respectively, and I and P are the input and
output conditions respectively."

An example of a formal problem specification for a program
to compute the integer square root of a nonnegative integer
n may be found in Manna and Waldinger [9].

"sqrt(n) <== FIND z SUCH THAT
    integer(z) & z**2 =< n < (z + 1)**2
 WHERE integer(n) & 0 =< n"

In the above example n is an element of the input data type,
z is an element of the output data type, sqrt is the problem
name, integer(n) & 0 =< n is the input condition, and
integer(z) & z**2 =< n < (z + 1)**2 is the output condition.
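
The input and output conditions above can be read as executable
predicates. The following is a minimal sketch in Python, not part
of the cited work: the names input_condition, output_condition,
satisfies_spec and candidate_sqrt are hypothetical, and the
candidate program is only one hand-written way to meet the
specification, which itself names no algorithm.

```python
def input_condition(n):
    # I(n): integer(n) & 0 =< n
    return isinstance(n, int) and n >= 0


def output_condition(z, n):
    # P(z, n): integer(z) & z**2 =< n < (z + 1)**2
    return isinstance(z, int) and z**2 <= n < (z + 1)**2


def satisfies_spec(program, inputs):
    """True if `program` meets the output condition on every legal input."""
    return all(output_condition(program(n), n)
               for n in inputs if input_condition(n))


def candidate_sqrt(n):
    # Hypothetical candidate: count upward until the next square exceeds n.
    z = 0
    while (z + 1)**2 <= n:
        z += 1
    return z
```

Checking a candidate on sample inputs, as satisfies_spec does, only
tests the specification; it does not prove the program correct for
all inputs.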

Formal problem specification and its application to
the program synthesis problem can best be explained through
examination of the work by Manna and Waldinger [9], Manna
and Waldinger [10], and Smith [8]. All of the work
is similar in that the formal specification is changed into
an appropriate program by some form of rewrite, but it is
valuable to differentiate the approaches by their rewriting
methods.

The first example is the system of Manna and
Waldinger [9]. Their system, called a deductive approach,
converts the formal specification into a program in some
target language. Their approach "combines techniques of
unification, mathematical induction, and transformation
rules into a single system." The following is a brief
explanation of this conversion.

A structure is needed to contain initial and
intermediate results of the conversion process. This
structure is called a sequent. The sequent is a tableau
containing two lists. The first list is a list of assertions
and the second list is a list of goals. Each element in
either list may have an output expression associated with
it. Figure 1 represents a sequent as a table. Each row in
the table may contain either an assertion or a goal but not
both. Figure 1 is the initial sequent for the integer square
root problem given above. The input condition has been
placed in the assertion list and the output condition placed
in the goal list. The output variable is associated with the
output condition in the output expression column. This
initialization assumes the input condition is true and a
search is attempted for the truth of the goal or output
condition.


sqrt(n) <== FIND z SUCH THAT
    integer(z) and z**2 =< n
    and n < (z+1)**2
WHERE integer(n) and 0 =< n

Figure 1. Initialized Sequent for the Square Root Problem

During this search, if the sequent ever contains a row where
the assertion can be trivially shown to be false or the goal
shown to be true, and if the output expression for that row
contains only primitives from the target language, then the
output expression is taken as the desired synthesized
program.

Once the tableau is initialized, the system's
deductive rules are applied to the assertions and goals. The
application of these rules will cause the creation of new
assertions and goals and associated output expressions. The
rules may then be applied to the new goals and assertions
until the condition for a program is satisfied. The
application of the rules changes the entries in the tableau
without changing the meaning of the tableau. We recommend
that the interested reader review the original work for a
description of the rules and their application.

The attraction of this theorem-proving technique is
that the resulting program can be proven correct by the same
steps used to create it. Currently there is not a running
implementation of this technique. One of the implementation
questions is determining what rule to apply at each step in
the synthesis process. This problem can be viewed as a
search through all possible sequences of rule applications.
This search space may become astronomical for any relatively
complex program since it may require hundreds of rule
applications. What is needed is a mechanism that can control
the search in a reasonable fashion. The form of control may
be heuristic in that there is a feel for where a rule should
be applied. If this intuitive feel can be quantified, then
this technique may become practical.

Earlier work by Manna and Waldinger [10] on the
DEDALUS automatic programming system also required formal
problem specifications. The DEDALUS system, an implemented
automatic programming system, utilized only transformation
rules. A transformation rule simply rewrites a portion of the
specification into another equivalent form. The repeated
application of these rules would eventually result in a
program in the target language.

4. Input-Output Pair Specification

Input-output pairs are a method of describing a
problem with examples of input and output behavior. For
example, if someone wanted to describe a program to compute
the Fibonacci numbers then he could supply the input-output
pairs

(1, 1)
(2, 3)
(3, 5)
(5, 8)
(8, 13)

The goal of a synthesizer system is to determine the
desired program from the examples of the input-output
behavior. One approach is to enumerate all possible programs
in the target language in order and test each program for
the desired behavior. That is, test each enumerated program
by giving it the input from each of the examples and seeing
whether the program gives the associated output. The enumeration
will produce the correct program at some point, but one
cannot determine whether an arbitrary program will produce the
desired behavior (see Biermann [11]). Therefore, the
following theorem is given by Biermann, "The programs for
the partial recursive functions cannot be generated from
samples of input-output behavior." A large class of programs
may be inferred from examples of input-output pairs provided
they belong to a class of programs where the halting
problem is decidable. Smith [12] and Summers [13] have
looked at the synthesis of LISP programs from example
input-output pairs. It has been shown that a restricted
class of LISP programs can be synthesized from example pairs
without enumeration over the class. The reader is invited to
review Biermann [14] and Gold [15] for theoretical
background information.
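
The enumerate-and-test approach described above can be sketched
for a deliberately tiny target language. The sketch below is an
illustration under stated assumptions, not any cited system: the
target language (expressions built from x, 1, + and *) and the
names expressions and synthesize are hypothetical. Every
expression in this language halts, so each test is decidable,
which is the restriction the text places on the class of programs.

```python
def expressions(size):
    """Yield (text, function) for every expression with `size` leaves."""
    if size == 1:
        yield "x", lambda x: x
        yield "1", lambda x: 1
        return
    for left_size in range(1, size):
        for lt, lf in expressions(left_size):
            for rt, rf in expressions(size - left_size):
                # Default arguments bind the current subexpressions.
                yield f"({lt} + {rt})", lambda x, lf=lf, rf=rf: lf(x) + rf(x)
                yield f"({lt} * {rt})", lambda x, lf=lf, rf=rf: lf(x) * rf(x)


def synthesize(pairs, max_size=6):
    """Return the first enumerated expression consistent with all pairs."""
    for size in range(1, max_size + 1):
        for text, fn in expressions(size):
            if all(fn(x) == y for x, y in pairs):
                return text
    return None  # nothing within the size bound fits the examples


# Hypothetical example pairs, consistent with f(x) = 2x + 1.
found = synthesize([(0, 1), (1, 3), (2, 5)])
```

The number of candidate expressions grows rapidly with size,
which is why enumeration alone becomes impractical for larger
programs.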

5. Example Computations

Program specification using example computations
allows more information to be obtained from the user. An
example computation is a sequence of instructions, without
an explicit control structure, which the user provides the
system in order to describe the behavior he wants from a
program. Examples are a good communication method which
people use to describe new concepts or explain new
processes. To describe a problem to the computer the user
uses the available instructions and provides an example of
what he wants done. Figure 2 shows an example computation
that demonstrates how to compute the first 10 Fibonacci
numbers.

In Figure 2 the two operand instructions (e.g. ADD)
perform the action on the two operands and leave the result
in the first operand. For example, if A = 2 and B = 3 then
ADD A,B would result in A = 5 and B = 3. All of the
instructions perform actions on some variables except for the
START, HALT, and NOTE instructions. START and HALT flag the
beginning and end of the program respectively. The NOTE
instruction provides information on the reason for the
execution of the next instruction.
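
The instruction semantics just described can be sketched as a
small interpreter for a straight-line trace. This is an
illustrative sketch only: the tuple encoding of instructions, the
name run_trace, and the sample trace are assumptions, and only
the ADD, START, HALT and NOTE instructions named in the text are
handled.

```python
def run_trace(trace, variables):
    """Execute a straight-line trace of (opcode, operand, ...) tuples."""
    env = dict(variables)
    for instruction in trace:
        opcode, *operands = instruction
        if opcode in ("START", "HALT", "NOTE"):
            continue  # these instructions act on no variables
        if opcode == "ADD":
            first, second = operands
            # The result is left in the first operand.
            env[first] = env[first] + env[second]
        else:
            raise ValueError(f"unknown instruction: {opcode}")
    return env


# Hypothetical trace reproducing the ADD A,B example from the text.
trace = [("START",), ("NOTE", "sum the operands"),
         ("ADD", "A", "B"), ("HALT",)]
result = run_trace(trace, {"A": 2, "B": 3})
```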

This method of specification depends on the user to
supply more information about the problem, including the
algorithm to be synthesized. The algorithm is implicitly
defined by the example computation that is given. This
specification technique should be contrasted with the
previous techniques. Note that the formal specification and
the input-output pair specification only required the user
to specify the desired behavior without specifying the
algorithm. Thus it can be claimed that these two methods
intentionally ignore information that the user has, assuming
that most users have an idea of the form of the algorithm.

The primary contributor to the understanding of
program synthesis has been Alan W. Biermann (see Biermann
and Krishnaswamy [16] and Biermann, Baum and Petry [17]). In
particular, Biermann [16] provides a formal definition of an
algorithm that will synthesize programs from example
computations. The algorithm and variations have provided the
basic structure upon which this thesis has been developed.
Briefly, the algorithm identifies the conditions that may
have inadvertently (or purposely) been left out of the
computation. A condition is a predicate as defined in
predicate calculus, that is, an entity for which a truth
value may be measured. Once the omitted conditions have been
inserted, the algorithm finds a labelling for the
instructions such that a program with a minimum number of
instructions is produced. To explain this labelling, assume
the instruction ADD A,B appears in three different locations
in an example computation (see Figure 2). Suppose it was
known that there have to be two occurrences of the
instruction. Then two of the instructions could be labeled
with a 1 and the other instruction labeled with a 2 to
indicate that the instruction labeled 2 is different from
the instructions labeled 1. Finding the labels for the
instructions in the example computations requires an
enumeration search of all possible labellings. The labelling
selected is the first labelling that produces a program that
is deterministic.
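
The labelling search can be sketched as follows. This is a
simplified illustration of the idea, not Biermann's published
algorithm: the trace encoding, the names label_trace and _search,
and the sample trace are assumptions. Each trace entry pairs an
instruction with the condition observed after it; a state is an
(instruction, label) pair; a labelling is accepted when no state
and condition lead to two different next states. Trying small
state budgets first makes the first deterministic labelling found
a minimal one.

```python
def label_trace(trace):
    """Return one (instruction, label) state per trace position."""
    distinct = len({instruction for instruction, _ in trace})
    # Smallest state budget first, so the first success is minimal.
    for budget in range(distinct, len(trace) + 1):
        states = _search(trace, [], {}, {}, budget)
        if states is not None:
            return states
    return None


def _search(trace, states, used, delta, budget):
    position = len(states)
    if position == len(trace):
        return list(states)
    instruction = trace[position][0]
    known = used.get(instruction, 0)
    # Existing labels first, then one fresh label if the budget allows.
    for label in range(1, known + 2):
        fresh = label > known
        if fresh and sum(used.values()) + 1 > budget:
            continue
        state = (instruction, label)
        key = None
        if position > 0:
            key = (states[-1], trace[position - 1][1])
            if delta.get(key, state) != state:
                continue  # same state and condition would lead two ways
        added = key is not None and key not in delta
        if added:
            delta[key] = state
        if fresh:
            used[instruction] = label
        states.append(state)
        result = _search(trace, states, used, delta, budget)
        if result is not None:
            return result
        # Undo this choice before trying the next label.
        states.pop()
        if fresh:
            used[instruction] = known
        if added:
            del delta[key]
    return None


# Hypothetical trace in which ADD A,B appears three times.
trace = [("START", "-"), ("ADD A,B", "-"), ("PRINT A", "more"),
         ("ADD A,B", "-"), ("PRINT A", "last"), ("ADD A,B", "-"),
         ("HALT", "-")]
labels = label_trace(trace)
```

In this sample trace the first two occurrences of ADD A,B receive
label 1 and the third receives label 2, because the third is
followed by HALT rather than PRINT A under the same condition.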
This algorithm is complete and the synthesized
programs are sound. Completeness means that the algorithm
can synthesize every possible program. Soundness means that
the synthesized program will correctly execute the example
used to construct it. A disadvantage of this synthesis
method is that the algorithm is an enumeration search and in the
worst case will require time exponential in the length of
the example computation to find a solution. Techniques have
been developed to speed up this search that will produce
satisfactory response for most practical programs.

6. A General Automatic Programmer Design

Before leaving this section on automatic programming we
wish to discuss a design for an automatic programmer that
uses at least two of the specification techniques. The name
of the system is PSI and it was designed by a group of
researchers at Stanford's Artificial Intelligence
Laboratory. The research effort was headed by Cordell Green
[3]. Green has presented a high level design of an
autoprogrammer that identifies some of the more important
areas that need further research. Green admits that the
design was an effort to focus attention on some of the
sub-areas of the overall synthesis problem. His modular
design does focus attention on different aspects of the
problem. The design decision to split the overall problem
into two main sub-problems of acquisition and synthesis is of
particular interest. This design choice allows work to
proceed concurrently on two hard problems with the interface
between the problems being some intermediate representation
of the problem.

PSI is a knowledge-based program understanding
system organized as a collection of interacting modules.
Figure 3 details the high level modular design of the PSI
system. The PSI design divides the system into two groups.
The acquisition group interfaces with the user and collects
the specification given by the user while the synthesis
group produces a program in some target language that meets
the user's requirements. Communication between the two
major groups is through an intermediate representation
called the program model. The goal of the acquisition group
is to accept the user's specification by either natural
language dialogue or by traces, and present a unified entity
to the synthesizer group. The implementation of the
synthesizer group is then simplified because of the
consistent representation it receives. Since the user's
input is converted into an intermediate representation that
is supplied to the synthesizer group, the user is free to
switch from one specification technique to another during
program specification.

The overall interaction with the user is meant to be
through natural language dialogue. Since natural language
understanding is not currently within the state of the art,
the system must interact in a subset of natural language
limited to a particular domain.


Figure 3. PSI's Modular Design [3, p. 6]

The system-user interaction is to appear as natural
as possible. The system has been designed to include a
mixed-initiative dialogue capability, which means the user or
the computer can assume the dominant communication role at
different times during the discourse. This allows the user
to provide as much knowledge as he can to help the synthesis
process and allows the computer to assist the user by asking
questions or providing responses. The system develops a
current model of the user and a model of the context that
assists the system in determining when to assume the
initiative and what questions to ask the user.

A partial implementation was completed in 1976 that
included the synthesis expert and the efficiency expert from
the synthesis group. The acquisition group modules have
proven to be a more difficult assignment and only portions
of the acquisition group have been implemented. The important
point of the PSI design is that it provides a modular
division of the program synthesis problem that helps provoke
study into these sub-problems.

C. OBJECTIVES

Automatic programmers which synthesize programs from
example computations require conditions to be explicitly
defined by the user in order to generate programs with a
minimum number of instructions. Previous work (Biermann and

The explicit definition of conditions is not a natural
part of an example computation. That is, one would not
normally give control structure information when using
examples to explain how a task is to be performed. Our
objective is to provide an environment where the user may
define the tasks he wants accomplished without explicitly
defining the control structures that specify the flow of
execution in a synthesized program.

We will implement an automatic programming system based
upon the example computation specification method in order
to study the feasibility of identifying conditions from user
actions. We limit this study to the domain of text editing
in order to provide a well defined area in which to work. It
is hoped that the results of our efforts may provide insight
into the overall problem and generate further research which
will extend condition identification to other domains.

D. THESIS ORGANIZATION

The thrust of this thesis is the development of methods
for the automatic construction of conditions necessary for
the proper synthesis of programs from example computations.
Example computation is one approach to the problem of
program synthesis. Chapter One introduces the reader to
program synthesis and gives a brief historical perspective
of the evolution of this field of study. Chapter One also
provides a comparison of the different proposed approaches
to this problem.

An automatic programmer has been implemented to support
this research. This synthesizer was developed to use the
example computation method for program specification.
Chapter Two is a detailed explanation of our particular
implementation. Chapter Two includes a discussion of
techniques we have incorporated in our implementation which
speed up the synthesis process.

Chapter Three presents our approach to generating
conditions given an example computation. It describes
algorithms which will generate conditions from a sequence of
editor instructions.

Chapter Four discusses the results of our research. A
brief discussion is included on the merits of the
synthesizer which we have implemented, and recommendations
are given for potential improvement. Finally, Chapter Four
presents a review of our work on identification and
construction of conditions from example computations. Areas
requiring further research have been highlighted and
examples of possible applications to other domains have been
pointed out.


II.  SYNTHESIZER 


A. GOALS

There is a two-fold purpose behind designing and
building the program synthesizer. The first directly relates
to the usefulness of the synthesizer. It is hoped that by
"laying the groundwork" for an autoprogramming system, the
impetus will be provided that will eventually result in a
total automatic programming environment being available for
the user. This environment is envisioned as an interactive
one consisting of several components: an interface to
provide the user with the means to perform example
computations, a link between the interface and the
synthesizer which records the user actions and transmits a
trace of those actions to the synthesizer, the synthesizer
itself which produces the algorithm in some internal form,
and, finally, a translator that receives the internal
representation of the algorithm and translates it into
machine-readable form and/or user-readable form. The second
purpose for which the synthesizer is built is to provide a
suitable vehicle to be used in the main area of research
that this thesis explores. If an autoprogrammer can generate
correct algorithms from example computations, how much can
be done to relieve the user from having to include branching
or looping conditions in his example computations?

B. OVERVIEW


1. General Description

An automatic programming system which produces
programs based upon the user's input of example computations
has a natural appeal. Example computations are sequences of
instructions performed in an algorithmic manner. For
instance, if the user is doing a matrix multiply, computing
the entry for the resultant matrix involves the sum of
products from the appropriate row and column of the
multiplicand and multiplier matrices, respectively. When
humans communicate ideas to each other, the proper use of
example computations often plays a vital role. It is hard to
imagine trying to explain the method of multiplying two
matrices together, or trying to explain the concept of
set-subset relationships, without being able to draw examples
that enhance the explanations. This method of communication
seems to be vital to human understanding of algorithms.
Since programmers often use small example computations while
coding programs, it seems that a logical approach to
automatic programming would consist of the machine doing the
actual program synthesis based upon example computations
given by the programmer.

Program synthesis is the act of putting instructions
together in such a way that an algorithm is built which
accomplishes a desired task. Obviously, an algorithm which
is an exact replication of the sequence of instructions will
accomplish the task, but it is uninteresting since it cannot
be generalized to accomplish a set of related tasks. For
example, a linear sequence of instructions which multiplies
two 2 x 2 matrices together will only work for 2 x 2
matrices; however, by allowing loop constructs and if-then
constructs, an algorithm can be produced which performs the
more general task of multiplying any two matrices with legal
row and column dimensions. So, in the case of the matrix
multiply, the task of the program synthesizer is to produce
a general matrix multiply algorithm given the example
computation for a 2 x 2 matrix multiplication in some form
such as:


     c[1,1] = a[1,1] * b[1,1] + a[1,2] * b[2,1]

     c[1,2] = a[1,1] * b[1,2] + a[1,2] * b[2,2]

     c[2,1] = a[2,1] * b[1,1] + a[2,2] * b[2,1]

     c[2,2] = a[2,1] * b[1,2] + a[2,2] * b[2,2]
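The step from the four example assignments to a general algorithm can be illustrated with a small sketch. It is written in Python purely for illustration (the thesis does not give this code), and the function names `multiply_2x2_straight_line` and `multiply_general` are hypothetical. The first function replays the example computation exactly; the second is the kind of looping generalization the synthesizer is expected to produce.

```python
def multiply_2x2_straight_line(a, b):
    """Straight-line replay of the four example assignments (2 x 2 only)."""
    c = [[0, 0], [0, 0]]
    c[0][0] = a[0][0] * b[0][0] + a[0][1] * b[1][0]
    c[0][1] = a[0][0] * b[0][1] + a[0][1] * b[1][1]
    c[1][0] = a[1][0] * b[0][0] + a[1][1] * b[1][0]
    c[1][1] = a[1][0] * b[0][1] + a[1][1] * b[1][1]
    return c


def multiply_general(a, b):
    """Looping generalization: works for any legal row/column dimensions."""
    rows, inner, cols = len(a), len(b), len(b[0])
    if any(len(row) != inner for row in a):
        raise ValueError("columns of a must equal rows of b")
    c = [[0] * cols for _ in range(rows)]
    for i in range(rows):            # loop over rows of the result
        for j in range(cols):        # loop over columns of the result
            for k in range(inner):   # sum of products for entry (i, j)
                c[i][j] += a[i][k] * b[k][j]
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
product_straight = multiply_2x2_straight_line(a, b)
product_general = multiply_general(a, b)
```

Both functions agree on the 2 x 2 example, but only the second also handles, say, a 2 x 3 matrix times a 3 x 1 matrix; the loop bounds play the role of the conditions discussed in the text.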

Generalizing from the example computation also
requires some means of noting when the array bounds have
been reached for this example. In other words, conditions
have to be interposed between some instructions where a
change in the flow of control for the algorithm is
necessary. An input trace is defined as a sequence of
instructions and conditions which describes the example
computation. In the matrix multiply example this might be
accomplished thusly:

     C[1,1] = 0
     C[1,1] = C[1,1] + A[1,1] * B[1,1]
     C[1,1] = C[1,1] + A[1,2] * B[2,1]

     COND - col index of A = col size of A

     C[1,2] = 0
     C[1,2] = C[1,2] + A[1,1] * B[1,2]
     C[1,2] = C[1,2] + A[1,2] * B[2,2]

     COND - col index of A = col size of A

     C[2,2] = C[2,2] + A[2,2] * B[2,2]

     COND - row & col index of C = Dimension of C

     STOP

The program synthesizer used for this thesis is
designed around concepts and ideas on synthesizing a program
given example traces as described in reference [1?].
Previous research, references [16], [17], and [18], seems to
indicate that correct programs can be synthesized on the
basis of relatively few sample computations, but that the
amount of time required to do the synthesis grows very
quickly as a function of program complexity.

2.  Trace  Coding 

The synthesis procedure is domain independent; that
is, the input trace can be coded into any consistent
representation, and it will not affect the operation of the
synthesizer. Since the synthesis procedure is independent of
the input trace representation, alphanumeric characters will
be used to represent instructions and conditions. They are
distinguished from each other by their position within the
trace rather than by their symbolic representation. For
example, an 'a' might represent an instruction or a
condition. Within the instruction set itself, identical
instructions are encoded as identical symbols. A simple
trace of a routine to find all positive numbers in an input
stream might be:

     A = 0
     READ B

     COND - B is negative

     A = A + 1
     READ B

     COND - B is negative

     A = A + 1
     READ B

     COND - B is positive

     PRINT B

     .

     .

If the instruction A=A+1 is represented by a 'b', each
occurrence of that instruction in the trace will have to be
represented by a 'b'. The reason for this constraint is
obvious. Since the synthesizer only receives a trace of the
example execution, it cannot determine whether A=A+1 is the
same instruction being encountered repeatedly in a loop, as
it is in this example, or whether there are several
independent occurrences of A=A+1. Figure 4 is an example of
a typical coded input trace. The left-hand column entries
are conditions and the right-hand column entries are
instructions. Figure 4 is read as state 's' transitions on
condition 'x' to state 'a', which in turn transitions on 'x'
to state 'b', and so forth.


transitions  states 


Figure  4.  Input  Trace 
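The coding rule described above (identical instructions always receive identical symbols, while instructions and conditions are told apart only by position) can be sketched as follows. This is an illustrative Python sketch, not the thesis's implementation; the function name `encode_trace`, the "I"/"C" tags, and the use of separate codebooks for instructions and conditions are assumptions made for the example, and it supports at most 26 distinct symbols of each kind.

```python
from string import ascii_lowercase


def encode_trace(steps):
    """Replace each instruction or condition text by a one-letter symbol.

    `steps` is a list of (kind, text) pairs, where kind is "I" for an
    instruction and "C" for a condition. Identical texts of the same
    kind always map to the same symbol. The two kinds use separate
    codebooks, so the same letter may denote an instruction and a
    condition; position in the trace tells them apart.
    """
    codebooks = {"I": {}, "C": {}}
    coded = []
    for kind, text in steps:
        book = codebooks[kind]
        if text not in book:
            # Next unused letter for this kind (assumes <= 26 symbols).
            book[text] = ascii_lowercase[len(book)]
        coded.append((kind, book[text]))
    return coded, codebooks


steps = [
    ("I", "A = 0"), ("I", "READ B"), ("C", "B is negative"),
    ("I", "A = A + 1"), ("I", "READ B"), ("C", "B is negative"),
    ("I", "A = A + 1"), ("I", "READ B"), ("C", "B is positive"),
    ("I", "PRINT B"),
]
coded, codebooks = encode_trace(steps)
```

Every occurrence of `A = A + 1` ends up with one symbol, which is exactly why the synthesizer cannot tell from the coded trace alone whether it is looking at a loop body or at several independent occurrences.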


3.  Input/Output  Trace  Representation 

A Moore-type representation, as defined in [19], can
be used to highlight certain features that must be dealt
with when producing an algorithm from an example trace.
Throughout the rest of the discussion, Moore machines and
algorithms will be used synonymously. Conditions relate to
transitions and instructions relate to states of the
machine. In fact, the function of the synthesizer can be
viewed as that of determining a minimum-state deterministic
Moore machine equivalent of a non-deterministic Moore
machine. Representing input traces as Moore machines will
often show the non-deterministic structure of the example
trace. This non-determinism must be resolved by the
synthesizer in order for an algorithm to be generated.
Figure 5 is the Moore machine representation of the input
trace of Figure 4. Notice that at node 'b', the trace is
non-deterministic. Transition 'y' leads from node 'b' to two
different nodes; similarly, transition 'x' leads from node
'b' to two separate nodes. Figure 6 is the deterministic
Moore machine which has been constructed by our synthesizer
based upon the input trace given in Figure 4. The
non-determinism has been resolved by splitting state 'a'
into two states distinguished from each other by an integer
prefix label. The assignment of the prefix label is the
mechanism used by the synthesizer to prevent
non-determinism. In order to accomplish this assignment, the
synthesizer uses an enumeration technique. Each instruction
is assigned a prefix label in a manner that maintains
determinism and assures that the algorithm will correctly
execute the input trace. It is easy to verify that the
deterministic Moore machine of Figure 6 will execute the
trace.
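The non-determinism that shows up when a trace is read directly as a Moore machine can be detected mechanically. The sketch below is illustrative only: it assumes the trace is given as three-character instruction-condition-instruction strings, it treats every occurrence of a symbol as the same state (no prefix labels yet), and the names `merged_transitions` and `nondeterministic_choices`, as well as the small example trace, are hypothetical and not taken from the figures.

```python
from collections import defaultdict


def merged_transitions(triples):
    """Map (state, condition) to the set of successor states.

    Every occurrence of an instruction symbol is treated as one state,
    i.e. no prefix labels have been assigned yet.
    """
    table = defaultdict(set)
    for lead, cond, trail in triples:
        table[(lead, cond)].add(trail)
    return table


def nondeterministic_choices(triples):
    """Return only the (state, condition) pairs with several successors."""
    return {key: targets
            for key, targets in merged_transitions(triples).items()
            if len(targets) > 1}


# Hypothetical trace: s -x-> a -y-> b -x-> a -y-> c
example = ["sxa", "ayb", "bxa", "ayc"]
conflicts = nondeterministic_choices(example)
```

In this made-up trace, state 'a' leaves on condition 'y' to both 'b' and 'c', so the two occurrences of 'a' would have to receive different prefix labels before a deterministic machine can be produced.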

C. SYNTHESIS PROCEDURE


1. Function

The function of the synthesizer program is to
provide a minimum-state, correct program consistent with the
input trace of the example computation. The synthesis
process will be completed when it is determined which
occurrence of a labelled instruction corresponds to each
particular instruction in the input trace. In order to
accomplish this goal, the synthesizer is basically
structured as a depth-first search algorithm. Backup and
fixup mechanisms exist to enhance the search procedure when
pruning has not kept the algorithm from traversing a
fruitless branch of the search tree. The search mechanism
attempts to assign a label to each instruction in such a
manner that the generated algorithm remains technically
correct; that is, nondeterminism is not allowed to exist and
the original trace can still be executed. A number of
techniques exist within the synthesizer which aid pruning of
the search tree, and thereby make it possible to synthesize
more complicated programs in a reasonable amount of time
than could otherwise be expected from a general enumeration
technique. These techniques offset the major disadvantage of
exponential growth of the search space as a function of
input which is found in a general enumerative search
technique.


2.  Concepts 

Certain definitions and concepts must be presented
before the actual algorithm is discussed. In order to
facilitate the discussion, it is necessary to refer to
Figure 7. Each level in the figure consists of an
instruction-condition-instruction triple, referred to as an
I-C-I. In Figure 7 the leftmost symbol under I-C-I is
referred to as the leading instruction of the triple, the
middle symbol is the condition, and the rightmost symbol is
the trailing instruction. The trailing instruction at level
i becomes the leading instruction at level i+1. So this
input trace represents the instruction-condition sequence 's
r a n s r a ...'.

     level   I-C-I

       1     sra
       2     ans
       3     sra
       4     ala
       5     axa
       6     aya
       7     axa
       8     anr

Figure 7. Instruction-Condition-Instruction Triple

Two levels i and j are said to belong to the same
couple-class if the elements of the levels are the same.
Instruction elements of the trace which are in the same
couple-class may be assigned the same prefix label during
synthesis if the assignment does not cause non-determinism.
For example, given the trace in Figure 7, levels 1 and 3 are
in the same couple-class, as are levels 5 and 7. Difference
set relations are another situation of interest. Two levels
i and j are in a difference set relation when their first
two elements are the same, but the third element is not the
same. A difference set relation indicates that the leading
instructions cannot be represented by the same state
regardless of the prefix label assigned during synthesis
because the leading instruction has the same transition to
two different trailing instructions. Again using the above
trace, level 2 and level 8 fall into this category. In this
situation, the index 8 would be entered into the difference
set for level 2. By implication, the index 2 is also in the
difference set for level 8, although, in practice, it is not
entered.
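The two definitions can be checked against the trace of Figure 7 with a short sketch. This is an illustration in Python rather than the thesis's code; the function names `couple_classes` and `initial_difference_sets` are hypothetical, and the triple at level 4 is taken as it appears in the figure ('ala'). Following the text, a difference set entry is recorded only at the lower-numbered level.

```python
from collections import defaultdict


def couple_classes(triples):
    """Group level numbers (1-based) whose I-C-I triples are identical."""
    groups = defaultdict(list)
    for level, triple in enumerate(triples, start=1):
        groups[triple].append(level)
    # A level with no matching partner is not in any couple-class.
    return {t: levels for t, levels in groups.items() if len(levels) > 1}


def initial_difference_sets(triples):
    """Levels with the same leading instruction and condition but a
    different trailing instruction; entered at the lower level only."""
    diff = defaultdict(set)
    for i, ti in enumerate(triples, start=1):
        for j in range(i + 1, len(triples) + 1):
            tj = triples[j - 1]
            if ti[:2] == tj[:2] and ti[2] != tj[2]:
                diff[i].add(j)
    return dict(diff)


figure_7 = ["sra", "ans", "sra", "ala", "axa", "aya", "axa", "anr"]
classes = couple_classes(figure_7)
differences = initial_difference_sets(figure_7)
```

The result reproduces the statements in the text: levels 1 and 3 share a couple-class, as do levels 5 and 7, and index 8 is the only entry in the difference set for level 2.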

Once the initial couple-class information and
difference set information have been determined, additional
difference set information can be obtained through the
chaining nature of differencing. For example, suppose the
trace consists of the one shown in Figure 8. Then the Moore
machine representation of this trace is shown in Figure 9.


     index   trace

       5     axa
       6     axa
       7     ays

       8     axa
       9     axa
      10     ayt

Figure 8. Chaining of Difference Set Relations


Figure 9. Non-deterministic Input Trace

This machine is obviously nondeterministic since
state 'a' transitions by 'y' to two different states.
Difference set resolution requires that the index for 'ayt'
be in the difference set of 'ays'. Since that requirement
causes different states to represent the 'a' in 'ayt' and in
'ays', and further since the trailing 'a' in the preceding
level is exactly the same instruction, the preceding levels
now satisfy the difference set relation. The leading
instruction and the condition are the same, but the trailing
instruction in the I-C-I triple is different since they have
previously been assigned to a difference set relation.
Therefore, the lead instruction must be labelled with a
different prefix during assignment and, similarly, the levels
above them. So the Moore machine will now be deterministic
and in the following form.


Figure 10. Deterministic Trace

Given a partial trace derived from the example
execution, there are numerous Moore machines that can be
constructed to satisfy the trace. At one end of the
spectrum, a program can be constructed such that each
succeeding state is assigned a different prefix label. This
method always results in a straight-line program. Each
instruction has one transition entering it and one
transition exiting from it. Allowing this method produces
the maximum size program consistent with the input trace.
See Figure 11. This is not a particularly desirable method
since it does not recognize loop structures that can
significantly reduce the size of the program. Additionally,

     condition   instruction

                     a
         i           a
         x           a
         x           a

Figure 11a. Trace          Figure 11b. Program

Figure 11. Straight-line program

On the other end of the spectrum, a program can be
constructed such that each identical instruction receives
the same prefix label. This method takes full advantage of
loop structures, and will result in a minimum state machine.
However, such a method will seldom produce a deterministic
machine; therefore, it will not produce a satisfactory
algorithm. See Figure 12.

     level   cond   instr


Figure 12a. Trace          Figure 12b. Program


Figure 12. Minimum State Machine


The best solution lies somewhere between these
endpoints. A reasonable first guess at the number of states
required to produce a deterministic machine within this
spectrum can be made by establishing a lower bound on the
number of states. The cardinality of the instruction set is
defined as the number of different instructions appearing in
the trace. Using the above figure as an example, it can be
determined that the cardinality of the instruction set is
two; that is, there are two different instructions, 'a' and
'b', in the trace. This measure provides an absolute lower
bound on the number of states required in the final machine.
This lower bound can be refined by determining a lower bound
on the number of states needed for each individual
instruction. Once again, using the above figure as an
example illustrates this concept. The instruction 'a' at
level 5 must be different than the instructions at levels 1
through 4 because of difference set resolution, or else
nondeterminism results on the transition 'y'. Therefore, in
order to maintain determinism, the instruction 'a' must be
allowed at least two states. Summation of the lower bounds
for each of the instructions gives a lower bound on the
total number of states required for the machine. For this
particular example, the program would be generated as:


Figure 13. Instruction Set Lower Bounds

If the search space is viewed as a tree structure
then the levels of the tree can be associated with the
instructions by assigning the first instruction in the input
trace to the first level, the second instruction to the
second level, and so forth. The branching factor at each
level is the state lower bound computed for the instruction
seen at that level. The prefix label assigned to the
instruction is represented by the specific branch used to
traverse to the next level.
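Under this view, the number of complete label assignments in the tree is simply the product of the branching factors along the trace. The sketch below illustrates the arithmetic in Python; the function name `search_tree_leaves` is hypothetical, and the instruction sequence and bounds are borrowed from the static processing example given later in this chapter (Figure 14, where 'a' has a bound of 2 and every other instruction a bound of 1).

```python
from math import prod


def search_tree_leaves(instructions, bounds):
    """Number of complete prefix-label assignments in the search tree.

    `instructions` is the sequence of instruction symbols in trace
    order (one tree level each); `bounds` maps each symbol to its
    current state bound, which is the branching factor at that level.
    """
    return prod(bounds[symbol] for symbol in instructions)


# Instruction occurrences of the trace a n p s a g a y r s r s r r a g a y t
instructions = "apaarrraat"
bounds = {"a": 2, "p": 1, "r": 1, "t": 1}
leaves = search_tree_leaves(instructions, bounds)
```

With five occurrences of 'a' the tree has 2^5 = 32 leaves. Raising the bound on 'a' to 3 would give 3^5 = 243, which shows how quickly the space grows and why pruning matters.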

The idea of providing a lower bound on the number of
states leads to an iteratively expanding depth-first search.
When all possible combinations of prefix labels have been
tried, but the algorithm remains non-deterministic, the
lower bound is incremented and the search is restarted from
the top level. When the lower bound is increased, the search
tree obtains additional paths to the final solution by
increasing the branching factor associated with one or more
instructions. The depth of a successful search into the tree
is restricted by the lower bound on the number of nodes
required by the deterministic machine. Only when a pattern
of prefix assignments has been made which allows the
algorithm to remain deterministic and all of the
instructions in the original trace have been assigned prefix
labels will the synthesis terminate. This mechanism prevents
a straight-line model from being output as the algorithm
unless it is the only one that can satisfy the input trace.
More importantly, it provides the minimum-state
deterministic machine capable of executing the input trace.
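The iteratively expanding depth-first search can be sketched in a few lines. The Python below is a deliberately simplified illustration, not the synthesizer described in this thesis: it omits couple-classes, difference sets and the failure memory, and instead of per-instruction bounds it uses a single budget on the total number of states, starting at the cardinality of the instruction set. All names (`split_trace`, `search`, `synthesize`) are hypothetical. The example trace is the one used for static processing in Figure 14.

```python
def split_trace(trace):
    """Split 'i c i c ... i' (no spaces) into instructions and conditions."""
    return trace[0::2], trace[1::2]


def search(instrs, conds, limit):
    """Depth-first assignment of prefix labels using at most `limit` states."""
    states = []   # states[k] = (label, symbol) chosen for occurrence k
    delta = {}    # (state, condition) -> successor state
    used = {}     # symbol -> number of prefix labels in use

    def place(k):
        if k == len(instrs):
            return True
        sym = instrs[k]
        in_use = used.get(sym, 0)
        # Try each existing label, then exactly one fresh label.
        for label in range(1, in_use + 2):
            fresh = label > in_use
            if fresh and sum(used.values()) + 1 > limit:
                break
            state = (label, sym)
            key = (states[-1], conds[k - 1]) if k > 0 else None
            if key in delta and delta[key] != state:
                continue  # would make the machine non-deterministic
            added = key is not None and key not in delta
            if added:
                delta[key] = state
            if fresh:
                used[sym] = in_use + 1
            states.append(state)
            if place(k + 1):
                return True
            # Back up: undo this assignment and try the next label.
            states.pop()
            if fresh:
                used[sym] = in_use
            if added:
                del delta[key]
        return False

    return (states, delta) if place(0) else None


def synthesize(trace):
    """Restart the search with a larger state budget until one succeeds."""
    instrs, conds = split_trace(trace)
    for limit in range(len(set(instrs)), len(instrs) + 1):
        found = search(instrs, conds, limit)
        if found is not None:
            return found
    raise AssertionError("unreachable: a straight-line program always fits")


states, delta = synthesize("anpsagayrsrsrragayt")
```

Because the budget only grows after every assignment under the smaller budget has failed, the first machine found uses the fewest states, and the straight-line program is reached only if nothing smaller is deterministic.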

D.  SYNTHESIZER  STRUCTURE 

The synthesis program is subdivided into two primary
modules: static processing of the input trace, and dynamic
processing of the information extracted from the input trace
by the preprocessing, or static processing, phase. Static
processing provides information such as couple-classes,
difference sets, and lower bounds on the number of machine
states. Dynamic processing uses knowledge inherited from
preprocessing to guide the search mechanism to a final
output of the algorithm. These two modules will be discussed
in turn, and the primary mechanisms involved will be
amplified.

1. Static Processing

Static processing can be conceptualized as
consisting of three main functions: (a) accept the input
trace; (b) preprocess the trace for difference sets,
couple-classes, and state bounds; and (c) prepare a trace
table for further use by dynamic processing. Once this
preprocessing has been accomplished, the static module is no
longer necessary to the synthesizer.

In the current configuration, the static module
expects to find the input as a sequence of
instruction-condition-instruction triples. Figure 14 is an
example of an input trace.

     level   trace

       1     anp
       2     psa
       3     aga
       4     ayr
       5     rsr
       6     rsr
       7     rra
       8     aga
       9     ayt

Figure 14. Typical Input to Static Processor

Each line consists of a triple, for example 'anp'.
The 'a' represents an instruction, and the 'n' represents the
condition which causes the program trace to transition to
the next instruction 'p'. For each level, the first element
represents the same instruction as the last element of the
preceding level. This is easier to see if the above trace is
represented as a Moore machine in which the nodes are
instructions and the conditions are transitions. State 'a'
transitions on condition 'n' to state 'p', which transitions
on condition 's' to state 'a', which transitions on condition
'g' back to state 'a', etc.

Each occurrence of an instruction symbol in the input trace
is represented by the same state at this point in the
synthesis.

Once the input trace has been accepted, static
processing can begin. Static processing consists of
determining the level indices associated with each
couple-class and with each difference set. For the trace of
Figure 15, these are shown in Figure 15.

There are two couple-classes in this trace. They are
[aga] at levels 3 and 8, and [rsr] at levels 5 and 6. The
remaining levels are not assigned to a couple-class because
no other levels match with them. Couple-class information is
useful to the dynamic processor for determining forced
assignments and dynamic non-equivalence. These ideas will be
discussed more fully in the section on dynamic processing.

Difference sets exist for levels 3 and 4. Level 4
has a difference set which contains the index 9; that is,
the element at level 4, 'ayr', must have a different prefix
label on 'a' than the element at level 9, 'ayt'. If the 'a'
is not labelled differently during the synthesis,
nondeterminism will result since the same transition would
lead to different nodes.

Difference set resolution is a very powerful
mechanism for ensuring deterministic behavior of the
algorithm. A considerable amount of the prefix label
assignments to the nodes can be resolved using difference
sets. Notice that level 8 appears in the difference set for
level 3 even though levels 3 and 8 are in the same
couple-class. At first this appears contradictory since
equivalent couple-class names imply that the elements are
the same, but difference set existence forces the lead
instructions to be different. This points out the relative
power of couple-class information and difference set
information. Difference set information is immutable.
Couple-class information only hints at equivalence. In this
particular example, the entry at level 3 was caused by the
chaining effect of difference set resolution. Notice that
since the 'a' at level 4 must be different than the 'a' at
level 9, and since the trailing 'a' at level 3
is, by definition, the same as the leading 'a' at level 4,
the trailing 'a' at level 3 cannot be the same as the
trailing 'a' at level 8; therefore, levels 3 and 8 cannot be
in the same couple-class.
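The chaining effect can be reproduced for the trace of Figure 14 with a small sketch. This Python is illustrative only and assumes a contiguous trace (the trailing instruction of each level is the leading instruction of the next); the function name `difference_pairs_with_chaining` is hypothetical. It first collects the directly visible difference pairs and then walks backwards: if the leading instructions of levels i and j must differ, and levels i-1 and j-1 share the same leading instruction and condition, then their trailing instructions differ, so i-1 and j-1 are in a difference set relation as well.

```python
def difference_pairs_with_chaining(triples):
    """Return all (i, j) level pairs, i < j and 1-based, whose leading
    instruction occurrences must receive different prefix labels."""
    n = len(triples)
    pairs = set()
    # Direct relation: same leading instruction and condition,
    # different trailing instruction symbol.
    for i in range(n):
        for j in range(i + 1, n):
            if triples[i][:2] == triples[j][:2] and triples[i][2] != triples[j][2]:
                pairs.add((i + 1, j + 1))
    # Chaining: propagate each pair to the preceding levels.
    work = list(pairs)
    while work:
        i, j = work.pop()
        pi, pj = i - 1, j - 1
        if pi >= 1 and triples[pi - 1][:2] == triples[pj - 1][:2]:
            if (pi, pj) not in pairs:
                pairs.add((pi, pj))
                work.append((pi, pj))
    return pairs


figure_14 = ["anp", "psa", "aga", "ayr", "rsr", "rsr", "rra", "aga", "ayt"]
pairs = difference_pairs_with_chaining(figure_14)
```

Only the pair (4, 9) is visible directly; chaining adds (3, 8) and then stops, because levels 2 and 7 ('psa' and 'rra') do not share a leading instruction and condition.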

To compute the lower bound on the number of states
in the algorithm, the minimum number of states needed for
each instruction is summed. For this same example, the
instruction set consists of {a,p,r,t}. The bounds for p, r,
and t are each 1. The bound for 'a' is 2. There must be at
least two different occurrences of 'a' from the difference
set resolution. Therefore, the minimum number of states with
which a deterministic Moore machine can be constructed for
this trace is 5.
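The arithmetic above can be reproduced with a short sketch. The text does not spell out how the per-instruction bound is derived from the difference sets, so the Python below makes an assumption: the bound for an instruction is taken to be the size of the largest group of its occurrences that are pairwise in a difference set relation, found by brute force. The function name `instruction_lower_bounds` is hypothetical, and the difference pairs (3, 8) and (4, 9) are those stated in the text for Figure 14.

```python
from itertools import combinations


def instruction_lower_bounds(triples, diff_pairs):
    """Lower bound on the number of states needed per instruction symbol.

    `diff_pairs` holds (i, j) level pairs with i < j whose leading
    instruction occurrences must be different states. Brute force is
    adequate for the short traces considered here.
    """
    symbols = {t[0] for t in triples} | {t[2] for t in triples}
    bounds = {}
    for sym in sorted(symbols):
        # Levels led by this symbol that take part in any difference pair.
        levels = sorted({lvl for pair in diff_pairs for lvl in pair
                         if triples[lvl - 1][0] == sym})
        best = 1
        for size in range(2, len(levels) + 1):
            if any(all(p in diff_pairs for p in combinations(group, 2))
                   for group in combinations(levels, size)):
                best = size
            else:
                break  # no group of this size, so none larger either
        bounds[sym] = best
    return bounds


figure_14 = ["anp", "psa", "aga", "ayr", "rsr", "rsr", "rra", "aga", "ayt"]
bounds = instruction_lower_bounds(figure_14, {(3, 8), (4, 9)})
total = sum(bounds.values())
```

For this trace the sketch gives 2 for 'a' and 1 for each of 'p', 'r' and 't', which sums to the lower bound of 5 stated above.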
Finally, static processing passes all the
information concerning the input trace to the dynamic
processor via a trace table in the following form. Each
level has only one associated condition and one associated
instruction. Since difference set information is associated
with the lead instruction in an
instruction-condition-instruction sequence, it is entered at
that level. Since couple-class information is associated
with the entire instruction-condition-instruction sequence,
it is associated with the trailing condition-instruction
pair.

Figure 17. Trace Table

2.  Dynamic  Processing 

Dynamic processing involves assigning prefix labels to the
states of the machine. In this way, separate occurrences of
the same instruction are differentiated. The dynamic
processor is the search mechanism for the synthesizer. It
operates in such a way that, at any point in the synthesis,
the portion of the trace previously processed represents a
deterministic Moore machine. In order to maintain the
determinism, dynamic processing steps through three phases:
(1) assignment of the prefix label to the instruction,
(2) difference set resolution, and (3) dynamic equivalence
assurance. Additionally, each of these phases has built-in
fixup and backup conditions associated with it. The
fixup/backup conditions encountered during difference set
resolution or during dynamic equivalence checking are
indicators that, if the current assignments remain the same,
a nondeterminism will occur in future assignments. As such,
they inform the pruning mechanisms of the search algorithm.

An integral part of the dynamic processor is the failure
memory. It controls the search. The failure memory may be
conceptualized as an L x P matrix, where L is the row size
and corresponds to the number of levels in the trace. Each
row has P columns, where P is equal to the lower bound
assigned to the instruction contained on that level of the
trace. An entry into the failure memory at some level i and
some column j, where 1 <= i <= L and 1 <= j <= P, prevents
the assignment of j as a prefix label for the instruction at
level i. When a failure memory cell contains an entry it is
called a valid cell; otherwise it is invalid. Each cell of
the failure memory is a two-element entry. The structure
factor is the first element. It indicates which level of the
trace caused the entry. The free state factor is the second
element. As the name indicates, this element is a function
of the number of free states available at the time of
assignment. The specifics of the failure memory operation
and the nature of failure memory entries will be discussed
throughout the rest of the section as each phase of the
dynamic processor is discussed.
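
One possible way to picture the failure memory is sketched below. The class and method names are hypothetical and chosen only for illustration; the text specifies just the L x P shape and the two-element cells (structure factor, free state factor).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entry:
    structure: int   # level of the trace that caused the entry
    free_state: int  # function of the free states available at that time

class FailureMemory:
    """Rows are trace levels; columns are candidate prefix labels."""

    def __init__(self, bounds_per_level):
        # bounds_per_level[i - 1] is P for the instruction at level i.
        self.rows = [[None] * p for p in bounds_per_level]

    def mark(self, level, label, entry):
        """Make cell (level, label) valid, forbidding that label there."""
        self.rows[level - 1][label - 1] = entry

    def is_valid(self, level, label):
        """A valid cell contains an entry; an invalid cell is empty."""
        return self.rows[level - 1][label - 1] is not None

fm = FailureMemory([2, 1, 2])
fm.mark(3, 1, Entry(structure=1, free_state=1))
print(fm.is_valid(3, 1), fm.is_valid(3, 2))
```

Using `None` for an invalid cell is a design choice of this sketch, not something the text prescribes.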
a.  Label  Assignment 

As previously mentioned, label assignment is the first
function provided by the dynamic processor. A label
assignment can be either forced or arbitrary. Additionally,
the assignment can result in the creation of a new state,
that is, a label-name combination not seen before. A forced
assignment occurs when the instruction at the current
working level is a member of the same couple-class as an
instruction at a prior level, and the lead instruction into
both of those levels has the same label assignment. The
current working level is defined as the level of the trace
which contains the most recently assigned prefix label, but
for which difference set resolution and dynamic equivalence
checking have not been completed. An example is given in the
trace shown in Figure 18.

Figure 18. Partial Trace Labelling

The label at level 7 is forced by the label assignments at
levels 4 and 5. Notice that the instructions at level 5 and
at level 7 are in the same couple-class, and that the
instructions at levels 4 and 6 have the same prefix label.
This condition forces the instruction at level 7 to have the
same prefix label as the instruction at level 5. The Moore
machine representation of the partial trace is shown in
Figure 19. The assignment at level 8 is also forced for
similar reasons. By definition, any forced assignment
involves previously assigned states, that is,
label-instruction combinations that have been seen before;
therefore, no forced assignment can result in a new state.
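
The forced-assignment test described above can be sketched as follows. The function name, the dictionary layout and the sample labels are illustrative assumptions; the rule itself (same couple-class, same label on the lead instruction) is the one stated in the text.

```python
def forced_label(levels, current):
    """Return the label forced on levels[current], or None if unforced.

    Each level is a dict with 'cc' (couple-class name or None) and
    'label' (assigned prefix label or None). The lead instruction of a
    level is taken to be the instruction at the preceding level.
    """
    if current == 0:
        return None
    lead_label = levels[current - 1]["label"]
    cc = levels[current]["cc"]
    if lead_label is None or cc is None:
        return None
    for k in range(1, current):
        prior = levels[k]
        if (prior["cc"] == cc and prior["label"] is not None
                and levels[k - 1]["label"] == lead_label):
            return prior["label"]
    return None

# Hypothetical labelling of levels 4 through 7; level 7 is not yet labelled.
levels = [
    {"cc": None, "label": 2},  # level 4
    {"cc": 3, "label": 1},     # level 5
    {"cc": 4, "label": 2},     # level 6
    {"cc": 3, "label": None},  # level 7
]
print(forced_label(levels, 3))
```

Because levels 5 and 7 share a couple-class and their lead instructions at levels 4 and 6 carry the same label, the sketch returns the label already given to level 5.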


Figure  19.  Partially  Determined  Moore  Machine 
The failure memory can be used in conjunction with forced
assignments to signal a backup condition to the search. If
the failure memory entry corresponding to the label
assignment at the current working level is valid, then a
contradiction results from the forced assignment. Suppose
that the trace table and failure memory are as shown in
Figure 20, and the forced assignment at level 8 has just
been made. The entry '1.1' at row 8, column 2 of the failure
memory is interpreted in the following manner. The integer
to the left of the decimal point indicates that the entry
was caused by the current assignment at level 1. The '1' to
the right of the decimal point is the number of free states
+ 1 available when the assignment at level 1 caused the
failure memory entry; therefore, when the entry was made
there were no free states available. A free state is one
which has not been bound to a particular instruction.

The assignment at level 8 is forced. In other words, the
sequence of the previous assignments causes the prefix label
of the instruction at level 8 to be a 2. However, the
failure memory contains an entry at row 8, column 2,
FM(8,2). This entry indicates that the instruction at level
8 cannot be assigned the label '2', for if it were to be
assigned a '2', a nondeterminism would result. To resolve
the conflict, backup is initiated until the last unforced
assignment is found. In this case, the backup is to level 6.
The assignment at level 6 will be changed and the search
will continue from there.


Trace Table

level  cond  instr  c-c  label

  .
  .
  4     a     a      -     2
  5     n     r      3     1
  6     r     a      4     2
  7     n     r      3     1
  8     r     a      4     2
  .

Figure 20. Trace Table/Failure Memory Configuration
for a Forced Assignment

If the assignment is not forced, the failure memory row
corresponding to the current working level is searched for
the first occurrence of an invalid cell. An invalid cell is
one which does not contain a failure memory entry. If a cell
is invalid, the assignment of a prefix label corresponding
to the failure memory column index for that cell is possible
on that level of the trace. The column number of the first
invalid cell becomes the label assignment for the
instruction at that level. For example, suppose level 5 is
the current working level and the trace table and failure
memory have the configuration shown in Figure 21.


Trace Table            Failure Memory

level  cond  instr      1     2     3     4

  5     r     a        1.1   4.1

Figure 21. Trace Table Entry Showing
Arbitrary Assignment Method

The first invalid entry in the failure memory on row 5 is in
column 3; therefore, instruction 'a' for level 5 will be
assigned a prefix label of 3. These non-forced assignments
may result in the creation of a new state; that is, a
label-instruction pair not previously assigned during the
synthesis. If, at some future point in the search, a backup
is initiated that reaches a level where a new state was
created, the backup mechanism will not stop there to perform
a retry. At any point in the synthesis, all previous levels
have received assignments based on the constraint that the
minimum number of states has been used consistent with
maintaining determinism; therefore, assigning a different
prefix label to a state which has been defined as a new
state only changes the name of the state, and does not
change the structure of the algorithm. Since the structure
of the algorithm has not been changed, the cause of the
nondeterminism is still present.
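
The arbitrary (non-forced) selection rule amounts to a scan for the first empty cell. The sketch below assumes a row is a list in which `None` marks an invalid cell; the function name is hypothetical.

```python
def arbitrary_label(row):
    """Return the column number (1-based) of the first invalid cell.

    Returns None when every cell in the row is valid.
    """
    for column, cell in enumerate(row, start=1):
        if cell is None:
            return column
    return None

# Row 5 of Figure 21: columns 1 and 2 hold entries, 3 and 4 are empty.
row5 = ["1.1", "4.1", None, None]
print(arbitrary_label(row5))
```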

One other type of assignment should be mentioned at this
point. Pseudo-assignment occurs when there is only one
invalid cell left in a failure memory row at a level other
than the current working level and there are no free states
available. Although pseudo-assignment does not immediately
cause a label to be assigned to the instruction at that
level, it does simulate a look-ahead mechanism for the
search technique by triggering difference set resolution and
dynamic equivalence checking as if that level of the trace
were assigned a value. Since the pseudo value is the only
value currently possible for that level, if a backup or
fixup condition is encountered during pseudo-assignment, the
assignment mechanism can immediately try another label at
the current working level, thereby saving the unnecessary
search of a path which it already knows to be nonproductive.

Once a tentative label assignment has been made to the
instruction at the current working level, difference set
resolution and dynamic equivalence checking can be
performed. Although these actions may cause a fixup on the
prefix label at the current working level, their primary
purpose is to furnish information to the failure memory that
will help guide future label assignments.

b.  Difference  Set  Resolution 

Difference set resolution prevents future assignments from
being made that are known to cause nondeterminism if the
current assignments remain unchanged. Difference sets
outline a significant portion of the structure of the input
trace without regard to label
assigned to the instruction at the level from which the
difference set is being resolved if the cell has not already
been made valid through a previous assignment. For example,
if the prefix assignment at level 1 is a '1', the failure
memory entries are made in column 1 at levels 3,5,15,1?.
Similarly, when the assignment '1' is made at level 2,
failure entries are made at levels 4 and 11. Now when the
assignment at level 3 is made, the dynamic processor will
not try to assign a prefix value of '1' since the failure
memory cell at (3,1) is valid. The assignment will
automatically be '2'. Notice that at level 5 the previous
assignments have caused the prefix label to be a '3'. In
other words, the failure memory has caused the search tree
to be pruned so that an assignment of '1' or '2' will not be
tried. Either one of these assignments would have resulted
in nondeterminism being introduced into the trace at level
6.


Figure  24a.  Prefix  Label  Equals  1 


Figure 24b. Prefix Label Equals 2

Figure  24.  Nondeterministic  Prefix  Label  Assignments 

While failure memory entries are being made under difference
set resolution, it is possible for a row to have all cells
valid except one. This has been previously defined as a
situation leading to pseudo-assignment. This situation has
occurred at level 11 in the example given in Figure 23. When
such an occurrence happens a look-ahead mechanism is
triggered to resolve the difference set at that level. In
this example, the failure memory cell at (21,3) has been
validated with an entry which indicates the current working
level as level 4 when the pseudo-assignment occurred at
level 11. Another situation which can occur in a failure
memory row is when all the entries in the row become valid.
This condition is called an incipient fence. When an
incipient fence exists and there are no free states
available, then no assignment can be made at that level.
This condition is called a fence.

Since the search mechanism always knows the level from which
it is doing look-ahead by difference set resolution, it is
able to perform a fixup on the label assignment at the
earliest possible time. A fixup is accomplished by
incrementing the prefix label by one. If an entire row in
the failure memory becomes valid and there are no free
states available, a fixup must be performed on the label
assignment at the current working level. If the label is
left the same, then when the search reaches the fenced
level, no assignment will be possible. Each time a fixup
occurs, all entries made in the failure memory as a result
of the previous label assignment are deleted, and entries
are then made based on the new label.
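
The row conditions named in this subsection (pseudo-assignment, incipient fence, fence) can be distinguished with a small classifier. This is only a sketch: the function name and return strings are assumptions, `None` stands for an invalid cell, and the requirement that a pseudo-assignment row lie at a level other than the current working level is left to the caller.

```python
def row_condition(row, free_states):
    """Classify a failure memory row given the number of free states."""
    invalid = sum(1 for cell in row if cell is None)
    if invalid == 0:
        # Every cell is valid: a fence only if no free state remains.
        return "fence" if free_states == 0 else "incipient fence"
    if invalid == 1 and free_states == 0:
        # Exactly one label is still possible at this level.
        return "pseudo-assignment"
    return "open"

print(row_condition(["1.1", "2.1", None], free_states=0))
print(row_condition(["1.1", "2.1", "4.1"], free_states=1))
print(row_condition(["1.1", "2.1", "4.1"], free_states=0))
```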
c.  Dynamic  Equivalence 

Couple-class information furnished by static processing aids
in the determination of dynamic nonequivalence. Dynamic
nonequivalence can occur during the synthesis at any level
below the current working level when the couple-classes are
equal. Dynamic equivalence results when instructions in the
same couple-class have been assigned the same prefix label.
Consider Figure 25. The I-C-I triples at levels 5 and 6 and
at levels 11 and 12 are [a,r,a]; therefore, they are in the
same couple-class. The instruction 'a' at level 5 has been
assigned a prefix of '2', and the instruction 'a' at level 6
has been assigned a prefix of '1'. Now, if the instruction
at level 11 is assigned a prefix of '2' and the instruction
at level 12 is assigned a prefix of '1', dynamic equivalence
will occur. Further, the assignment at level 12 will be
forced. Dynamic nonequivalence results when such an
assignment scheme causes nondeterminism. Dynamic equivalence
checking functions as a look-ahead mechanism by preventing
the future occurrence of a forced assignment which will
result in nondeterminism. Suppose the synthesizer is
inspecting the trace in Figure 25, and has just assigned the
instruction at level 6 a prefix of '1'.

Notice that level 12 is in the same couple-class as level 6.
Since the instruction at each of these levels is in the same
couple-class, the possibility exists that they may be the
same instruction. If the instruction at level 11 is assigned
a label of '2' when the working level reaches that part of
the trace, then the assignment at level 12 will be a forced
assignment of '1'. However, an entry has already been made
in the failure memory at (12,1) which indicates that the
instruction at level 12 cannot be assigned a prefix of '1'.

Figure 25. Trace Table/Failure Memory

In order to avoid this contradiction and a backup, dynamic
nonequivalence processing causes an entry at (11,2) of the
failure memory, which corresponds to the labelling of '2'
given to the instruction at level 5. Once this is
accomplished, when the working level descends to level 11,
an assignment of '2' cannot be made and, as a result, the
assignment at level 12 will no longer be forced by dynamic
equivalence. This gives the synthesizer a chance to try
other assignments that will maintain determinism of the
algorithm.

Pseudo-assignment conditions and fixup conditions can occur
in the failure memory as a result of validation of all but
one of the failure memory cells in a row, in the same manner
that they occur in difference set resolution. Additionally,
dynamic equivalence and difference set resolution can
interact to cause failure memory entries in the following
manner. If a failure memory entry is made by difference set
resolution at any level which is in the same couple-class as
a level previously assigned a prefix label, and if the
failure memory entry prevents the assignment that would
cause the instructions to become part of the same state,
then dynamic nonequivalence will result; therefore, an entry
must be made in the failure memory to indicate this
condition.

3. Backup/Fixup

The discussion of backup and fixup conditions has been saved
until last. The basic idea behind constructing the
synthesizer is to provide as much information as possible to
the search mechanism, and thereby direct the label
assignment with a minimal number of retries. With this in
mind, backup and fixup become last resorts.

The fixup operation attempts to resolve nondeterminism by
incrementing the label at the current working level when a
contradiction occurs. If the newly incremented label is not
a legal assignment or does not correct the contradiction,
then backup must be initiated. The fixup operation cannot be
attempted if the assignment at the current working level is
forced or if the assignment created a new state. In either
of these cases, a fixup operation would leave nondeterminism
in the algorithm.

If a fixup fails, or cannot be attempted, backup is
initiated. Backup must be initiated from the current working
level when any level is discovered which contains one of
these conditions:

1) The label assignment is forced and the failure memory
cell corresponding to that level and label is valid,

2) The label assignment causes a contradiction and
represents a new state, or

3) There is no free state available for the instruction
at a particular level, and all entries in the failure
memory row at that level are valid.

The backup begins at the current working level regardless of
which level triggered the mechanism, and continues until
none of the three conditions given above is present. At that
level a fixup operation is attempted and the search begins
anew. Any entries into the failure memory which were caused
by levels greater than or equal to the new current working
level are invalidated by resetting the failure memory
entries to (0,0). Additionally, any assignments are deleted
along with their side-effects, such as annotations on forced
assignments and new states. If backup causes the working
level to be decremented to zero, a free state is added for
the use of the first instruction needing more states than
initially allotted as the lower bound.
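
A simplified sketch of the backup walk is given below. It assumes that each level carries two flags, `forced` and `new_state`, and that a level with either flag set cannot be retried; the full three-condition test of the text is not reproduced. Failure memory cells are modelled as `(structure factor, free state factor)` tuples with `(0, 0)` meaning invalid, as in the text. All names are illustrative.

```python
def backup(levels, current):
    """Walk back from the current working level (1-based).

    Returns the first level at which a fixup may be attempted, or 0
    when the walk runs off the top of the trace.
    """
    level = current
    while level > 0 and (levels[level - 1]["forced"]
                         or levels[level - 1]["new_state"]):
        level -= 1
    return level

def invalidate(failure_memory, new_level):
    """Reset entries caused by levels >= new_level to (0, 0)."""
    for row in failure_memory:
        for j, (structure, _free) in enumerate(row):
            if structure >= new_level:
                row[j] = (0, 0)

levels = [
    {"forced": False, "new_state": False},  # level 1
    {"forced": False, "new_state": False},  # level 2
    {"forced": True, "new_state": False},   # level 3
    {"forced": False, "new_state": True},   # level 4
]
fm = [[(1, 1), (3, 1)], [(2, 1), (0, 0)]]
new_level = backup(levels, 4)
invalidate(fm, new_level)
print(new_level, fm)
```

When `backup` returns 0, the text calls for adding a free state before the search restarts; that step is omitted here.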
III. PREPROCESSOR

A. PROBLEM SPECIFICATION

The program synthesizer expects a set of triples where each
triple is an instruction, a condition, and an instruction.
Biermann [2] has shown that conditions inadvertently or
purposely omitted by the user may be inserted into a trace.
The algorithm for insertion of conditions collects the set
of atoms seen on the transitions for an instruction. An atom
is an entity which has a value of either 'true' or 'false'.
A condition is composed by logical conjunction and
disjunction operations on atoms. For example, an atom may be
'c <= 0', but a condition may be 'c <= 0 and a = 4'. A set
of minterms is computed from the set of atoms and one of the
minterms is inserted after each occurrence of that
instruction in the trace. If {a,b} is a set of atoms, then
the set of minterms will be {{a,b},{-a,b},{a,-b},{-a,-b}},
where - stands for logical negation. It has been shown in
reference [16] that only one of the minterms can be true for
each occurrence of a transition from any single instruction.
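
The minterm set for a collection of atoms can be enumerated mechanically. The sketch below represents a negated atom by prefixing '-' to its name, following the notation above; the function name and the tuple representation are assumptions.

```python
from itertools import product

def minterms(atoms):
    """Return every combination of each atom or its negation."""
    ordered = sorted(atoms)
    return [
        tuple(a if keep else "-" + a for a, keep in zip(ordered, signs))
        for signs in product([True, False], repeat=len(ordered))
    ]

for term in minterms({"a", "b"}):
    print(term)
```

For n atoms the sketch yields 2^n minterms, exactly one of which holds under any given truth assignment to the atoms.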

One problem with the algorithm is that it is incapable of
inserting conditions if the user has failed to supply any
atoms after a particular instruction. For example, if the
user should specify instruction I1 followed by instruction
I2 in one part of the trace and instruction I1 followed by
I3 in another part of the trace, but the user fails to
provide a condition after either occurrence of I1, then the
algorithm will be unable to generate a condition for I1. It
is assumed that I1 does not appear with an atom elsewhere in
the trace. The synthesizer will force two states for I1 to
resolve any nondeterminism. This mechanism is fully
explained in Section II. If conditions had been supplied in
the above example, the difference in the two programs would
be the number of states assigned to instruction I1. Figure
26 shows a partial computation without explicitly expressed
conditions along with the associated synthesized program
fragment. Figure 26 assumes that I1 does not appear
elsewhere in the trace. Figure 27 is a representation of the
same partial computation except that the conditions c1 and
c2 have been explicitly expressed. The computations in both
figures are the same, and each program fragment will
correctly execute either trace; therefore, the programs must
be equivalent with respect to program behavior. However, the
program in Figure 27 is minimal in that it contains fewer
states because the user explicitly supplied the conditions.
(S,...,I1,I2,...,I1,I3,...,H)

Example Computation

Synthesized Program

Figure 26. Computation without Explicit Conditions

(S,...,I1,c1,I2,...,I1,I3,...,H)

Example Computation

Synthesized Program

Figure 27. Computation with Explicit Conditions


We intend to show that there are mechanisms which can be
used to automatically generate the necessary conditions for
the correct synthesis of an algorithm produced by an example
computation without the user explicitly defining them. The
problem may be described as follows. Given an example
computation without explicitly defined conditions, infer
those conditions necessary to control the flow of
computation in a manner such that the synthesized program
will demonstrate the behavior desired by the user. In order
to facilitate the solution to the problem, a condition will
be viewed as a function that returns a value of 'true' or
'false' when called, rather than as a logical operation on
atomic boolean entities. The problem can then be thought of
as constructing a function.

Very little information is available to the current version
of the synthesizer when the user provides only a sequence of
instructions, certainly not enough to generate minimal
programs such as the one in Figure 27. This led us to search
for other sources of information that would allow us to
construct the necessary conditions. We soon realized that
the instructions issued by the user do not exist in a
vacuum. These instructions manipulate data. If the entire
computer memory, including registers, is viewed as the
domain of interest, then execution of an instruction always
changes this state. Intuitively, the domain also reflects
the reason that the user decided to execute a particular
instruction. A search of a space of this size in order to
determine the reason is impractical; however, observing only
those data elements affected by the sequence of instructions
can often be quite practical and can significantly reduce
the search space.

We chose the text editing domain as the domain of interest
since we felt that it would be sufficiently interesting to
warrant application of synthesis techniques. This domain was
selected because, first, techniques developed in this domain
may be general enough for extension into other domains;
secondly, the world for this domain can be described as the
set of all characters contained in a particular text file,
which makes the world finite; and finally, the instruction
set is small enough to be manageable.

Although our primary research is directed toward studying
techniques to apply to automatic condition generation, we
feel that the synthesizer could be a powerful text editor
and could provide some useful features not normally seen in
conventional text editors. Extended features could include
the ability to capitalize the first letter of every
sentence, the ability to capitalize all small letters in the
text, the ability to identify a string and perform some
operation before, after or on it, or any combination of
these editing actions.

The working hypothesis is to have the user process the text
file in a normal manner and have the synthesizer infer a
program from his actions. Two requirements were levied upon
the user. The first requirement is that he must inform the
synthesizer when he desires to have a program generated so
that the synthesizer can begin monitoring the user's
actions. A great deal of time was spent trying to figure out
methods that allowed one general mechanism to be used to
monitor the user's actions and the resulting changes in the
text file. Since we could not produce such a mechanism, a
second requirement was levied on the user. This requirement
recognizes a basic distinction between two different aspects
of text editing: context free substitutions and context
sensitive substitutions. We define a context free
environment to be one in which the character to be operated
upon is not dependent on characters around it. Capitalizing
all occurrences of small letters is an example of a context
free operation. A context sensitive operation is defined as
an operation in which the action to be performed on a
character or sequence of characters depends upon other
characters around the main character of interest.
Capitalizing the first letter of every sentence is a context
sensitive operation. Condition inference in a context
sensitive environment is inherently more difficult than in a
context free environment in that the condition must be
constructed from events which require a look-ahead
capability not inherent in the synthesizer. The user will be
free to switch from environment to environment at his
convenience. The synthesizer will create program segments
from each environment which can be used to construct a
complete program by a post-processor.

B.  DESIGN  FOR  A  CONTEXT  FREE  ENVIRONMENT 
1. Overview

Programs that operate on a single entity can be constructed
by the synthesizer. Figure 28 shows the construction of a
program from a trace intended to communicate that the letter
'd' should be capitalized wherever it appears in the text
file. The column labelled 'trace' contains triples of the
form instruction, condition, instruction. B is the start
instruction, R is the move right instruction, C is the
capitalize or change instruction and S is the stop
instruction. The conditions for this trace are the
characters seen in the text file prior to the execution of
the second instruction in each triple. The special condition
"0" is the null condition, and is always inserted after the
start instruction.

The generated program will correctly execute the trace that
was used to construct it, and by examination of the program
it can be shown that the program will convert all d's to D's
in a text file consisting of the characters A, b, C, d, F
and G. There are no arcs available for other characters in
the character set. In order to generate a program to perform
the same function on an arbitrary text file, the user would
be forced to give an example of the desired transition for
every character in the character set.

Since it is desirable to relieve the user of the chore of
providing an inordinate number of examples in order to
completely specify the function, a method is required that
uses a few examples of the types of conditions that are to
appear on the arcs to generalize the conditions into a more
compact and complete form. If a generalization can be found,
the multiple arcs may be replaced with a more general
condition and, therefore, correct programs can be created
with fewer examples. However, the combination of arcs
between nodes must be accomplished so that determinism is
maintained, or the synthesizer will not create a minimum
state machine capable of performing the desired function.
That means that the generalization technique must be able to
handle conflicts properly. The arcs in Figure 28 that
originate at state R and terminate at state R appear to
consist of elements from the capital letters and small
letters. The generalization {x | x ∈ capital letters} ∪
{z | z ∈ small letters} would appear to be a reasonable
replacement for all of the R to R arcs. If this
generalization were made, a conflict would result because
the letter 'd', which labels the arc from R to C, is also an
element of {z | z ∈ small letters}.


Trace

B  0  R
R  A  R
R  b  R
R  C  R
R  d  C
C  D  R
R  F  R
R  G  R
R  0  S


Synthesized program


Figure 28. Synthesizer Action


Structure


The preprocessor is designed to accumulate knowledge from
the traces it is provided, then use the knowledge to
construct meaningful conditions. The preprocessor scans the
input trace looking at the instructions and the characters
that are seen before the instructions. This phase extracts
pairs of instructions from the trace. The trace in Figure 28
would have the instruction pairs (B,R), (R,R), (R,C) and
(C,R) extracted. Attached to each of these pairs is the set
of characters that were seen between the pair. The
preprocessor then analyzes the information to determine if a
generalization can be made from the set of characters
associated with each instruction pair.
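The scanning phase can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: the trace rows of Figure 28 are encoded as (instruction, character, next instruction) tuples, `None` stands in for the null entry on the start and halt rows, and the name `extract_pairs` is hypothetical. The sketch also records the final R to S pair, which the text above does not list.

```python
# Rows of the Figure 28 trace: (current instruction, character seen, next instruction).
# None is an assumed stand-in for the null entry on the start and halt rows.
TRACE = [
    ("B", None, "R"), ("R", "A", "R"), ("R", "b", "R"), ("R", "C", "R"),
    ("R", "d", "C"), ("C", "D", "R"), ("R", "F", "R"), ("R", "G", "R"),
    ("R", None, "S"),
]

def extract_pairs(trace):
    """Map each instruction pair to the set of characters seen between the pair."""
    pairs = {}
    for current, char, following in trace:
        seen = pairs.setdefault((current, following), set())
        if char is not None:
            seen.add(char)
    return pairs

pairs = extract_pairs(TRACE)
```

Here the R to R pair collects A, b, C, F and G, while the R to C pair collects only d, which is the information the analysis phase needs in order to notice that small letters appear on two different arcs leaving R.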

The natural division mentioned above allows the
preprocessor to be divided into two modules. The first
module performs the scanning function while the second
module analyzes the information and applies a heuristic to
provide the most general condition possible. The
implementation of the preprocessor will be discussed later,
but before it can be discussed an explanation of the data
structures required by the preprocessor is needed.

3.  Preprocessor Data Structures

To simplify the problem we define two types of
instructions in this domain. Instructions that specify the
current location of interest are cursor positioning
instructions. Instructions that change the state of the
domain are data manipulation instructions. The preprocessor
accepts as input a sequence of instructions and an
associated sequence of characters. The first instruction in
the instruction sequence is always the start instruction,
which does not have a character associated with it. The last
instruction in the sequence is always a halt instruction.
Every action performed by the user is captured and appended
to the instruction sequence list. The character sequence is
created in harmony with the instruction sequence. In the
quiescent state the cursor will indicate a certain position
in the text. When the user performs some action such as
moving the cursor right, a monitor picks up the value in the
old position and associates that value with the instruction
executed by the user. For example, assume a user has a text
file in lower case letters that he wants to change to all
upper case letters. The user initiates the synthesizer, then
proceeds across the line of text changing lower case letters
to upper case letters. For the purpose of this example,
assume the line of text is "change lower case to upper
case". As the user moves across the line making
substitutions, the condition monitor captures the actions
performed and the characters seen. The example line would
yield an instruction sequence of (B, C, R, C, R, C, R,
..., C, S). The associated character sequence would be (c,
C, h, H, a, A, ..., e, 0). The "C" and "R" in the
instruction sequence are the capitalize and move right
instructions, respectively. Note that the capitalize
instruction does not reposition the cursor, and when the user
moves the cursor to the right, the result of the capitalize
instruction is associated with the move.

Another data structure needed by the preprocessor is
the ASCII vector. The ASCII vector is a 128-byte linear
array with indices numbered 0 through 127. Each byte in the
array is referenced by the decimal value of a particular
ASCII character. For example, the array element reserved for
the ASCII character '0' is indexed by 48 decimal. The array
element reserved for the ASCII character 'A' is indexed by
65 decimal. The vector defines a partition of the ASCII
character set by using the following technique. The ASCII
character set has been divided into eight mutually exclusive
subsets.

Subset 0  Capital letters
Subset 1  Small letters
Subset 2  Numbers
Subset 3  Space character <sp>
Subset 4  Symbols
Subset 5  Punctuation
Subset 6  Arithmetic operators
Subset 7  Control characters

The subset name is entered into the ASCII vector at each
cell by converting the ASCII character to its decimal
equivalent and using that value as the array index. The
default partition is shown in Figure 29.

ASCII  0  1  ...  9  A  B  ...  Z

Figure 29. ASCII Vector
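A minimal sketch of how such a vector might be initialized is given here. The eight subset numbers come from the list above, but the exact membership of the symbols, punctuation and arithmetic operator subsets is not stated in the text, so the `PUNCTUATION` and `ARITHMETIC` strings and the function name `build_ascii_vector` are assumptions made only for illustration.

```python
import string

# Assumed membership for subsets 5 and 6; the source does not list them.
PUNCTUATION = ".,;:!?'\""
ARITHMETIC = "+-*/="

def build_ascii_vector():
    """Return a 128-entry list mapping each ASCII code to its subset number."""
    vector = [4] * 128                      # subset 4 (symbols) is the catch-all
    for code in range(128):
        ch = chr(code)
        if code < 32 or code == 127:
            vector[code] = 7                # control characters
        elif ch in string.ascii_uppercase:
            vector[code] = 0                # capital letters
        elif ch in string.ascii_lowercase:
            vector[code] = 1                # small letters
        elif ch in string.digits:
            vector[code] = 2                # numbers
        elif ch == " ":
            vector[code] = 3                # space character
        elif ch in PUNCTUATION:
            vector[code] = 5
        elif ch in ARITHMETIC:
            vector[code] = 6
    return vector

ascii_vector = build_ascii_vector()
```

Looking up a character is then a single index operation, for example `ascii_vector[ord("0")]` reads cell 48.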

The character set hierarchy is defined by the tree
structure in Figure 30. The tree is related to the ASCII
vector through the character subset names contained on each
node one level above the leaf nodes. For the default
hierarchy shown in Figure 30, a zero would be entered in the
ASCII vector for all capital letters, and a 1 would be
entered for all small letters. If a different partition of
the character set is required, the user can modify the
hierarchy or create his own. An example will be given to
explain how the modification may be accomplished. Assume a
partition is desired where the vowels are isolated into a
set. Assume further that the vowels are to be subdivided
into capital vowels and small vowels. The hierarchy would be
modified by placing a son called 'vowels' on the alphabetic
node. Attach to the new node two sons, called 'Cap-vowels'
and 'Small-vowels', with arcs to the appropriate characters.
Relabel the hierarchy so that sibling relations are numbered
in increasing order. Finally, initialize the ASCII vector
with the new labelling. All of the modifications can be done
by the system when the user calls for the modification. The
modified hierarchy is shown in Figure 31.

The next data structure used by the preprocessor is
the transition table. The transition table contains the
knowledge gleaned from scanning the instruction sequence and
the character sequence created by the monitor. Figure 32
shows the format of the transition table. The transition
table is an array of records with each record containing
information on a transition. In the table, I1 and I2 are
instructions where I2 directly follows I1 in at least one
place in the instruction sequence. 'Active-sets' is a field
that contains information on sets of characters that have
been observed by the monitor on the transition from I1 to
I2. The fields 'Set-1' through 'Set-n' contain the value for
the set name, the count of the elements from the set
associated with the transition and a pointer to a linked
list of the elements. The records that would be created for
the trace given in Figure 28 would be associated with the
transitions B to R, R to R, R to C, C to R and R to S.

|  I1  |  I2  |  Active-sets  |  Set-1  |  Set-2  |  ...  |  Set-n  |
|      |      |               |         |         |       |         |
|  ... |      |               |         |         |       |         |

Figure 32. Format of the Transition Table
4.  Implementation 

The context free preprocessor consists of two main
modules: the scanner and the insertion modules. Another
important module, not part of the preprocessor, is the user
monitor. The monitor gathers the actions of the user and
creates two arrays. One array contains the sequence of
instructions the user provided and the other contains
information on what was true before an instruction was
executed. The information that is gathered is then passed to
the appropriate preprocessor.

The example instruction and character sequences given
in Figure 33 will be used to explain the mechanism of the
preprocessor. Figure 33 is illustrative of a collection of
actions that were performed by some user. The user's goal
is: change all lower case letters in a text file into upper
case letters. The user has activated the condition monitor,
positioned the cursor at the beginning of a line of text and
moved right along the line, changing the lower case letters
to upper case whenever one appeared above the cursor. Figure
33 is an example of output from the monitor assuming the
line the user processed was "The numbers 1, 2, 3, 5, 7 ARE
prime.". The first column in Figure 33 is the character
array. It contains the character under the cursor prior to
execution of the instruction in column two. Column two is a
trace of the actions performed by the user. The "r"
represents the "move cursor right" instruction and the "c"
represents a change without cursor reposition instruction.
Figure 33 can be read as: the character in column one was
observed and the instruction in column two was executed.

Figure 33. Monitor Output
The scan module of the preprocessor is activated when
the user indicates the representative example is complete.
Let 'inst-index' be an index for the instruction array that
is initialized to 1. The first step is to create a
transition from the start instruction to the first
instruction in the instruction array and add the transition
to the transition table. This transition will indicate the
beginning of the program and will transition to the first
instruction provided on a null condition. The module then
moves down the instruction array creating other transitions
and adding them to the transition table. Duplicate
transitions will not appear in the table. A transition is
defined as a pair (I1,I2), where I1 and I2 are instructions
and I2 follows I1 within the instruction array. The
instruction array in Figure 33 yields transitions such as
(R,C) and (C,R).

The transitions are constructed by indexing through
the instruction array. The instructions at inst-index and
inst-index + 1 form a transition. The transition is then
matched against the transition table. If a match occurs, the
character in the character array at inst-index + 1 is
extracted and its ASCII value is used to index into the
ASCII vector. The value stored in the ASCII vector is used
as an exponent for two and the result is stored in a
temporary variable. A bit by bit logical OR is performed
between the temporary variable and the Active-sets variable
for the transition and the result is stored in Active-sets.
Active-sets contains the information of every set from the
partition that has elements seen on the transition. The
operation described above allocates one bit for each set in
the partition. If Active-sets equals 1, then the lowest
order bit of Active-sets is a 1, signifying that at least one
element of set 0 has been seen on this transition. A two
would signify that some element of set 1 had been seen, and
a three would signify that some element of set 0 and some
element of set 1 had been seen.
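The Active-sets update described above amounts to setting one bit per subset. The sketch below illustrates it under assumptions: `subset_label` is a hypothetical stand-in for the ASCII vector lookup that covers only subsets 0 through 3, and `record` is an illustrative name rather than one taken from the thesis.

```python
def subset_label(ch):
    """Stand-in for the ASCII vector lookup; handles subsets 0 to 3 only."""
    if ch.isupper():
        return 0        # capital letters
    if ch.islower():
        return 1        # small letters
    if ch.isdigit():
        return 2        # numbers
    if ch == " ":
        return 3        # space character
    raise ValueError("character outside the subsets covered by this sketch")

def record(active_sets, ch):
    """OR two raised to the subset number into the Active-sets value."""
    return active_sets | (1 << subset_label(ch))

active = 0
for ch in "Ab":
    active = record(active, ch)   # sets bit 0, then bit 1
```

Because the update is a bitwise OR, seeing further characters from a subset that is already recorded leaves the value unchanged.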

In the transition table are fields for each set that
has been determined to be active for the transition. Within
each of the set fields there are three subfields: the first
is the set name, the second is a count of the elements seen
for the set and the last is a pointer to the start of a
circularly linked list containing the elements used from the
set. The value that was obtained from the ASCII vector is
used as a set name and matched against each of the set
fields' set names. If the set name matches an entry, the
character at inst-index + 1 is added to the linked list in
lexicographical order if not already on the list, and the
count is incremented by one. If a match does not occur on
the set name, a new set field is created and given the name
that was obtained from the ASCII vector, the count is set to
one, and the character is put on the list.

When the scan module reaches the end of the input,
the transition table contains an entry for each transition
that was seen. Each transition is associated with all the
sets that had elements seen with the transition. Finally,
each transition is associated with the actual elements
through the linked list for each set. The information is
then passed to the insertion module for analysis. Figure 34
shows the completed transition table and the linked list of
elements for each set.

Once a completed transition table has been created,
control is passed to the insertion module. The insertion
module processes the information in the transition table and
assigns a condition for each transition.

NOTE: The notation <1>, <2>, etc. represents a pointer to
the linked list headed by the same symbol.

Figure 34. Completed Transition Table

The Active-sets entries provide an efficient
mechanism for recognizing potential conflicts on emanating
arcs. Performing a bit by bit AND on the Active-sets entries
that have a common originating instruction yields the source
of conflicts. The bit positions that are on (bit equals 1)
are the set (or sets) that have had elements on multiple
transitions. For example, let (I1,I2) and (I1,I3) be entries
in the transition table with Active-sets values of five (0101
binary) and three (0011 binary) respectively. Let Q equal
the result of the bit by bit AND of the Active-sets values
given above (i.e. 0001). Q indicates that there is a
conflict between the transition (I1,I2) and the transition
(I1,I3). Furthermore, Q indicates that the set causing the
conflict is labelled zero in the hierarchy of Figure 30
because the on bit is in the rightmost position, which
corresponds to two raised to the zero exponent. Using the
exponent to enter the hierarchy, it can be determined that
capital letters were seen on both transitions. Once all the
conflicts for transitions with the same originating
instruction are known, the conflicts must be resolved before
an assignment of conditions can be made.
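The conflict check in this example is a single bitwise AND followed by reading off the bits that remain on. A small sketch, in which the function name `conflicting_sets` is hypothetical:

```python
def conflicting_sets(active_a, active_b):
    """Return the subset numbers whose bits are on in both Active-sets values."""
    q = active_a & active_b
    return [bit for bit in range(q.bit_length()) if q & (1 << bit)]

# The example from the text: five (0101) AND three (0011) leaves 0001.
q_sets = conflicting_sets(0b0101, 0b0011)
```

The result for the example is the single subset number 0, the label used for capital letters in the default partition. An empty list means the two transitions share no subset and need no resolution.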

Extending the example given above, assume that eight
capital letters were seen on transition (I1,I2) and four
capital letters were seen on the transition (I1,I3). A
partial condition can be constructed for the transition
(I1,I2) as a set difference between the set of capital
letters and the actual elements seen on the transition
(I1,I3). The partial condition for the (I1,I3) transition
becomes the set of capital letters that were actually seen
with this transition. The initial conditions for these
transitions become the union of the sets indicated in
Active-sets as not being in conflict and the sets created by
the resolution of conflicts. Therefore, the condition for
(I1,I2) is ({x | x ∈ capital letters} - {x | x ∈ capital
letters on other transitions}) ∪ {x | x ∈ numeric}, and the
condition for (I1,I3) becomes {z | z ∈ ({actual capital
letters seen} ∪ {small letters})}. In this example, it was
assumed that the sets numeric and small letters were an
appropriate generalization for the transition. In practice
this cannot be done without consideration of the number of
elements that have been seen from the set on the transition.
If the count field for the set exceeds a threshold value for
the set, the generalization may be made; otherwise the
elements themselves become the partial condition for the
transition.
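The resolution step can be illustrated with ordinary set operations. The sketch below is an assumption-laden illustration rather than the thesis code: the letters chosen for the two transitions are invented, the function names are hypothetical, and giving the set difference to the transition with more examples is an assumption the text suggests but does not state outright.

```python
import string

CAPITALS = set(string.ascii_uppercase)
SMALL = set(string.ascii_lowercase)

def resolve_conflict(seen_major, seen_minor, full_set):
    """Split a shared subset between two transitions leaving the same instruction.

    The transition with more examples receives the whole subset minus the
    elements seen on the other transition, which keeps only its own elements.
    """
    if set(seen_major) & set(seen_minor):
        raise ValueError("same character on both transitions: not deterministic")
    return set(full_set) - set(seen_minor), set(seen_minor)

def apply_threshold(seen, full_set, threshold):
    """Generalize to the whole subset only when the count exceeds the threshold."""
    return set(full_set) if len(seen) > threshold else set(seen)

# Eight capitals on (I1,I2) and four on (I1,I3); the letters are invented.
cond_12, cond_13 = resolve_conflict(set("ABCDEFGH"), set("WXYZ"), CAPITALS)
```

The two partial conditions are disjoint by construction, which is what keeps the arcs leaving I1 deterministic.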

After a condition has been constructed for a
transition, a final strong generalization technique is
employed. The Active-sets value for the transition again
supplies the starting point for this technique. Notice that
adjacent bits in Active-sets correspond to adjacent nodes in
the hierarchy. Therefore, a check is made of the Active-sets
value to see if it has adjacent bits with a value of one. If it
does, then a generalization may be attempted. Assume the
condition (({capital letters} - {A, E, I, O, U}) ∪ {small
letters} ∪ {numeric}) has been constructed for some
transition. The Active-sets value for this transition must
be seven (0111 binary). With the default hierarchy in Figure
30, a generalization to Alphabetic and then to Alpha-numeric
would be attempted. Notice that a generalization to
Alpha-numeric would fail because of a conflict with another
transition. Intuitively, ({alpha-numeric} - {A, E, I, O, U})
would be a correct choice for the condition for this
transition. A general procedure for the construction of
generalized conditions is given below.

A set of nodes $Y = \{y_1, y_2, \ldots, y_n\}$ is
generalizable to a node X if the set of nodes Y forms a
complete and exhaustive set of leaves of the subtree rooted
at X. Further, a set of nodes $Z = \{z_1, z_2, \ldots, z_m\}$ is
generalizable to the set $W = \{w_1, w_2, \ldots, w_j\}$, $j < m$,
where each $w_i$ is a generalization of a subset of Z.

IF the condition $= P_1 \cup P_2 \cup \ldots \cup P_n$
     where $P_i = z_i - q_i$, $i = 1, \ldots, n$
     where $q_i \subseteq z_i$ ($q_i$ possibly null)
THEN
     the condition is set to $W - \bigcup_{1 \le i \le n} q_i$
     where W is the smallest set $W = \{w_1, \ldots, w_j\}$
     such that W generalizes $\{z_1, z_2, \ldots, z_n\}$
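The procedure can be sketched as a bottom-up collapse over the hierarchy. The fragment of the tree used below is an assumed shape based only on the node names mentioned in the text, since Figure 30 is not reproduced here, and the names `CHILDREN`, `generalize` and `generalized_condition` are hypothetical. The sketch does not check for conflicts with other transitions.

```python
# Assumed fragment of the character set hierarchy: parent -> sons.
CHILDREN = {
    "alpha-numeric": ["alphabetic", "numeric"],
    "alphabetic": ["capital letters", "small letters"],
}

def generalize(nodes):
    """Collapse nodes into parents whenever every son of a parent is present."""
    nodes = set(nodes)
    changed = True
    while changed:
        changed = False
        for parent, sons in CHILDREN.items():
            if parent not in nodes and set(sons) <= nodes:
                nodes -= set(sons)
                nodes.add(parent)
                changed = True
    return nodes

def generalized_condition(parts):
    """parts is a list of (subset name z_i, excluded elements q_i) pairs.

    Returns the generalized node set W and the union of the q_i, which
    together represent the condition W minus the excluded elements.
    """
    w = generalize(name for name, _ in parts)
    excluded = set().union(*(q for _, q in parts))
    return w, excluded

w, excluded = generalized_condition([
    ("capital letters", set("AEIOU")),
    ("small letters", set()),
    ("numeric", set()),
])
```

For the example in the text this yields the alpha-numeric node together with the excluded vowels, matching the intuitive answer of alpha-numeric minus {A, E, I, O, U}.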


C.  DESIGN  FOR  A  CONTEXT  SENSITIVE  ENVIRONMENT 


1.  Overview 

Condition generation in the context sensitive
environment is a more difficult task than in the context
free environment. This difficulty arises from the scope of
knowledge required to make decisions on what a condition is
to be. The conditions themselves are more complex because
they depend not only on the character that is being seen,
but also on the characters that precede and follow the
current character under consideration. The following example
will be used to illustrate the difficulties and our solution
to this problem. Assume a user wishes to capitalize all
occurrences of the word 'time' in some text file. Also
assume that the word occurs at the beginning, at the end,
and in the middle of sentences in the text file. The
question is how to construct a program that performs the
desired function given only the actions the user performs as
an example of the required program.

The assumption about the position of the word 'time'
in the text file implies that the requested action needs to
be accomplished on strings that have very different
characteristics. Certainly, both 'time' and 'Time' should be
capitalized, as should 'time,', 'time?' and 'time<sp>'. On
the other hand, the string 'time' should not be capitalized
when it occurs within a word like 'sometime' or 'timely'.


Any generated program that behaves as described
above must be able to recognize an occurrence of the string
or some variation of the string. The totality of this
information must be glued together to provide a single
condition that is descriptive of what the surrounding
environment must be like before the action is performed. The
implication is that the condition itself must be able to
perform checking and look-ahead. In other words, the
condition for the transition to the operation must in fact
be a procedure which responds 'true' whenever the string of
interest is recognized. Assuming for the present that the
string of interest can be discerned from the user's actions
(a hard problem by itself, see Angluin [19]), one must wonder
how such a procedure can be constructed and then inserted
into the generated program so that it performs the function
of a condition on some transition in the program. Figure 35
shows a procedure which recognizes the word 'time'. Note the
robustness of the procedure in that it distinguishes between
the differing occurrences of 'time' as mentioned above.
Figure 35 points out that the problem is not just generating
a procedure as a condition but also generating conditions
within the procedure that is to be the overall condition.
The arcs labeled 'T v t' and '<SP> v {punctuation}' should
be noted with interest because they provide the robustness
the condition procedure needs. The discovery of arc labels
for the condition procedure will be discussed next.

Figure 35. Condition for "time" and "Time"

The monitoring of user actions provides the
instruction and character sequence in the same manner as
done in the context free mode. Consideration was given to
requiring that more information be provided by the monitor;
however, the notion was discarded because it would require
the user to be aware of the functioning of the preprocessor.
Requiring the user to provide information to the system
would betray our goal for the system. The user should only
be required to initiate the system and then perform editing
as if the system was not actively monitoring his actions. We
feel the requirement of specifying whether the user wants to
perform context free or context sensitive operations is the
maximum that should be asked. If it were feasible to
recognize the difference between the two modes from the
user's actions alone, this limitation would also be removed.

Given only the instruction sequence, the character
sequence, and the information of a context sensitive
environment, the first assignment of the context sensitive
preprocessor is to discern the string of characters upon
which some operation is to be performed. This is a pattern
recognition problem of considerable difficulty. Angluin [19]
provides the following theorem, "There is an effective
procedure which, when given a sample S as input, outputs a
pattern p which is descriptive of S.". The sample S is a
subset of the set of all strings over the alphabet of the
language. The effective procedure is computationally
expensive and not desirable to implement in our system. The
procedure is an enumeration technique on patterns with a
length less than the shortest example in the sample set S.
Each of the enumerated patterns is tested to determine if it
is descriptive of the entire set S. The longest pattern that
is descriptive of S is the most specific pattern for the
set. Clearly, as the length of the sample grows, the number
of enumerated patterns will grow exponentially. Angluin [19]
states, "in the general case, the test performed on the
patterns is an NP-complete problem.". The test she is
referring to is the check to see if the enumerated pattern
is descriptive of S.

For implementation purposes, we need a mechanism
that falls well short of the exponential behavior of the
effective procedure mentioned above. The text editing domain
has two types of instructions for the purpose of this paper.
The first type of instruction will be called cursor
positioning instructions while the second type will be
called data manipulating instructions. Assuming the text
file is to be represented as a linear array, only one cursor
position instruction need concern us. All cursor positioning
commands such as move left, move up or move down can be
represented as move right instructions. Data manipulation
instructions operate on one character and do not reposition
the cursor.


The method we have adopted for determining the
string of interest and the context of the string is based on
the above definition of the types of instructions available
in the text editing domain. The preprocessor scans the
instruction sequence looking for an occurrence of a data
manipulation instruction. The character associated with this
instruction is then taken as the first character of the
string of interest. Other characters are added to the string
by continuing the scan until multiple occurrences of cursor
positioning instructions are encountered. A hypothesis is
then constructed consisting of three parts. The first part
is the beginning context. It is constructed from the
characters that preceded the string in the character
sequence. The second part is the string itself and the final
part is the ending context constructed from the characters
seen after the string. For engineering considerations, the
number of characters in the beginning and ending context
will be limited to twenty characters. The probability of the
context exceeding twenty characters on both sides of the
string in the text editing domain is small enough to ignore.
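A minimal sketch of hypothesis construction follows. It assumes the monitor output is encoded as two aligned Python strings with the start instruction omitted, that 'C' marks a data manipulation instruction and 'R' a move right, and that every C is followed by the R that moves past the modified character. The function name `build_hypothesis` and the dictionary layout are illustrative choices, not taken from the thesis.

```python
def build_hypothesis(instructions, characters, limit=20):
    """Build (begin context, string, end context) from aligned sequences."""
    first = instructions.index("C")      # first data manipulation instruction
    string, i = [], first
    # Each C sees an original character; the R after it sees the modified one.
    while i < len(instructions) and instructions[i] == "C":
        string.append(characters[i])
        i += 2
    return {
        "begin": characters[:first][-limit:],   # at most `limit` characters before
        "string": "".join(string),
        "end": characters[i:i + limit],         # at most `limit` characters after
    }

# A user moving right over "The time is" and capitalizing 'time'.
hypothesis = build_hypothesis("RRRRCRCRCRCRRRR", "The tTiImMeE is")
```

The scan stops at the first position where two cursor moves occur in a row, which is how the sketch interprets "multiple occurrences of cursor positioning instructions".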

Once a hypothesis is proposed it is set aside as an
active hypothesis and scanning of the input continues. Other
cases of data manipulation instructions surrounded by cursor
positioning instructions will result in other hypotheses
being constructed. As these hypotheses are added to the
active hypothesis list they are checked for consistency, and
if a new hypothesis causes conflicts they are resolved by
constructing another hypothesis from the conflicting
hypotheses. To demonstrate this mechanism we present an
example which will illustrate the generation of hypotheses
and their resolution into a condition function. The example
is the construction of the function which will recognize the
string 'time'.

Suppose the text file contained the following
sentences somewhere in the file.

The time is two oclock.

It is time to go to bed.

Time the runner.

Did you run out of time?

Also, suppose the user has specified the environment is to
be context sensitive and has begun to perform actions on the
file. The monitor could create the following instruction and
character sequence fragments from the user moving through
the text file and capitalizing these occurrences of 'time'.

(RRRRCRCRCRCRRRR ...)

(The tTiImMeE is ...)

(RRRRRRCRCRCRCRRRR ...)

(It is tTiImMeE to ...)

(RCRCRCRRRRR ...)

(TiImMeE the ...)

(... RRRRRRRRRRRCRCRCRCRR)

(... run out of tTiImMeE?)

This example is not to imply the user must change all
occurrences in the text file, but he should provide enough
examples from the file to ensure his desires are understood.
If the user has not supplied a distinguishing set of
examples and an incorrect program is generated, he may add to
the set of examples.

Scanning the first instruction sequence until the
first data manipulation instruction results in the string
'time' being constructed. The resulting hypothesis is that
the string 'time' is within the context of 'The<sp>' and
'<sp>is two oclock.'. The hypothesis may be viewed as the
following data structure.

Hypothesis 1:

Begin context: The<sp>

String: time

End context: <sp>is two oclock.

A second hypothesis would be generated for the next portion
of the instruction sequence as shown below.

Hypothesis 2:

Begin context: It is<sp>

String: time

End context: <sp>to go to bed.

A comparison of these two hypotheses indicates a
disagreement between the contexts. The conflict is resolved
by determining the longest beginning and ending context that
agree between the two hypotheses and generating a hypothesis
reflective of this agreement. By working backward from the
last character in the begin context for both hypotheses, it
is possible to ascertain that the only character in
agreement is the space. Working forward from the first
character in the end context for both hypotheses, again the
only character in agreement is the space. A third hypothesis
with the new begin and end contexts is generated as follows:

Hypothesis 3:

Begin context: <sp>

String: time

End context: <sp>
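The comparison just described is a longest common suffix on the begin contexts and a longest common prefix on the end contexts. The sketch below shows one way to compute it; the dictionary layout and the function names are assumptions for illustration, and the sketch presumes both hypotheses carry the same string.

```python
def common_prefix(a, b):
    """Longest string that both a and b start with."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return a[:n]

def common_suffix(a, b):
    """Longest string that both a and b end with."""
    return common_prefix(a[::-1], b[::-1])[::-1]

def merge(h1, h2):
    """Build the hypothesis on which two hypotheses for the same string agree."""
    return {
        "begin": common_suffix(h1["begin"], h2["begin"]),
        "string": h1["string"],
        "end": common_prefix(h1["end"], h2["end"]),
    }

h1 = {"begin": "The ", "string": "time", "end": " is two oclock."}
h2 = {"begin": "It is ", "string": "time", "end": " to go to bed."}
h3 = merge(h1, h2)
```

With the two hypotheses above, both contexts shrink to a single space, which is Hypothesis 3.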

This hypothesis specifies that the string 'time'
must be preceded and followed by a space. Note that the test
of the hypothesis implies the user is allowed to specify one
string during an example computation. It is also implied
that there must be a begin and an end context for the
string. Since it is possible to have two hypotheses where
one of the context strings does not agree in any of the
characters, a method must exist to provide the appropriate
context.

Whenever the comparison between contexts of two
hypotheses results in the null string, a disjunction is
formed from the characters immediately next to the string.
For example, the instruction sequence given above would give
the hypothesis:

Hypothesis 4:

Begin  context:  Did  you  run  out  of<sp> 

String:  time 
End context: ?

A comparison between hypothesis 3 and hypothesis 4
would result in the null string for the end context. Since
there must be an end context, the disjunction of <sp> and ?
is forced and this becomes the end context for the new
hypothesis. Generalization techniques that were mentioned in
the section on context free environment are then applied in
an attempt to reduce the end context to the most general
context consistent with the data seen. The only alteration
in the generalization scheme is the lowering of the
threshold values for important sets. In this example, the
threshold value for the punctuation set would be lowered to
1 and the ending context would become { x | x = space or
x ∈ {Punctuation} }.
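
The forced disjunction and the lowered threshold can be sketched as follows. This is only an illustration: the function names, the returned pair, and the use of Python's string.punctuation to stand in for the punctuation set are assumptions, and the generalization scheme of the context free section is reduced here to a single membership count.

```python
import string

# Stand-in for the punctuation set; the thesis does not list its members here.
PUNCTUATION = set(string.punctuation)


def forced_disjunction(end_a: str, end_b: str) -> set:
    """Characters immediately next to the string in each end context."""
    return {end_a[0], end_b[0]}


def generalize(chars: set, named_sets: dict, threshold: int = 1):
    """Replace observed members of a named set by the set name once at
    least `threshold` of its members have been seen (assumed rule)."""
    literals = set(chars)
    set_names = []
    for name, members in named_sets.items():
        seen = literals & members
        if len(seen) >= threshold:
            literals -= seen
            set_names.append(name)
    return literals, set_names


observed = forced_disjunction(" ", "?")
literals, set_names = generalize(observed, {"Punctuation": PUNCTUATION}, threshold=1)
```

With the threshold at 1, the single '?' is enough to generalize to the whole punctuation set, leaving the space as the only literal character in the end context.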

The final problem to be solved is the recognition of
variations in a string. Examples of variations of a string
are 'Time' and 'time', or 'enclosure' and 'inclosure'. As
mentioned, if the user intends to capitalize all occurrences
of 'time', 'Time' is to be included. Note these variations
of the string become the compound labels for the arcs in
Figure 35. The system includes a rule that enables the
recognition of variations of strings provided the user gives
an example of the variation. The rule simply states that the
string length will be established to be as long as the
longest string encountered during processing. Again, using
the example, the hypothesis for 'Time the runner.' would be:

Hypothesis 5:

Begin context: ... T

String: ime

End context: <sp>the runner.

It has been established by preceding user actions
that the string length for the hypothesis should be 4. By
matching the pattern in hypothesis 5 with the string from
hypothesis 4 it can be determined that the string in
Hypothesis 5 should be expanded by inserting a 'T' in front
of the string. Another hypothesis is then generated where
the string will be the disjunction between the strings 'time'
and 'Time'. The final hypothesis from the example would then
be:

Hypothesis 6:

Begin context: <sp>

String: 'time' v 'Time'

End context: { x | x = space or x ∈ Punc.}
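
The string-length rule that turns 'ime' into 'Time' can be sketched as below. The helper names and the suffix check are assumptions for illustration; the begin context of Hypothesis 5 is reduced to the single character 'T' that precedes the short string.

```python
def expand_to_length(begin: str, short: str, length: int):
    """Move characters from the end of the begin context onto the front
    of the string until it reaches the established length."""
    missing = length - len(short)
    if missing <= 0:
        return begin, short
    if missing > len(begin):
        raise ValueError("begin context too short to expand the string")
    return begin[:-missing], begin[-missing:] + short


def string_variants(known: str, begin: str, short: str):
    """Form the disjunction of the known string and the expanded one.
    Assumes the short string matches the tail of the known string."""
    if not known.endswith(short):
        raise ValueError("pattern does not match the known string")
    new_begin, expanded = expand_to_length(begin, short, len(known))
    return new_begin, sorted({known, expanded})


new_begin, variants = string_variants("time", "T", "ime")
```

Here variants holds 'Time' and 'time', the two members of the disjunction in Hypothesis 6.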

Once this hypothesis has been generated, it is then
used to examine the input for negative examples that can
strengthen or weaken the hypothesis. Suppose the input
contained the fragment "... timely results...". Processing
the input with Hypothesis 6 would show a match for the
string, but the end context would not agree; therefore, the
hypothesis will be strengthened by changing the end context
as shown below:

Final  Hypothesis: 

Begin context: <sp>

String: 'time' or 'Time'

End context: { x | x = space v
x ∈ Punc.

x  e  small  letters} 
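
The search for negative examples can be sketched as a scan that reports every occurrence of the string together with whether its contexts agree with the hypothesis. The function name, the predicate arguments and the sample sentence are assumptions for illustration; how the system records the strengthened end context is not modelled.

```python
import string


def classify_occurrences(text, variants, begin_ok, end_ok):
    """Return (position, agrees) for every occurrence of a variant.
    agrees is False when the string matches but a context does not,
    which is the case treated above as a negative example."""
    results = []
    for variant in variants:
        start = text.find(variant)
        while start != -1:
            before = text[start - 1] if start > 0 else ""
            after_index = start + len(variant)
            after = text[after_index] if after_index < len(text) else ""
            results.append((start, begin_ok(before) and end_ok(after)))
            start = text.find(variant, start + 1)
    return sorted(results)


def is_space(c):
    return c == " "


def is_space_or_punctuation(c):
    return c == " " or (c != "" and c in string.punctuation)


sample = "Did you have time? We got timely results."
found = classify_occurrences(sample, ["time", "Time"], is_space, is_space_or_punctuation)
```

In this sample the first occurrence agrees with Hypothesis 6, while the occurrence inside 'timely' matches the string but fails the end context.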

After the input has been processed and a final
hypothesis proposed, the hypothesis is used to construct a
procedure such as shown in Figure 35. The first part of [...]
to be [...] context [...] instructions in the instruction
set, and the arc labels consist of the information in the
final hypothesis. A start state is placed in the procedure
with an arc to a move right instruction (R). Since the
procedure is a string match or look-ahead routine all states
other than the start state will be move right instructions.
Each of the states will have two arcs exiting them. The
labels on these two arcs will be the negation of each other.

The construction is accomplished by placing the
first character of the begin context on the exiting arc
going to a new move right state. The other arc is labeled
with the negation of the character and this arc terminates
at the first move right state. Each character of the begin
context creates another move right state labeled as
mentioned.

The string from the hypothesis is then used to
complete the procedure that has been partially constructed.
If the string is composed of disjunctions, the characters
are used to form disjunctions. Each of the disjunctions is
combined with conjunctions. The final hypothesis above
provides a string of 'time' or 'Time'. The conjunction of
disjunctions will be formed as:

('T' v 't') & ('i' v 'i') & ('m' v 'm') & ('e' v 'e')

Upon  reduction  the  string  will  be  expressed  as: 

('T' v 't') & 'i' & 'm' & 'e'
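
The formation and reduction of the conjunction of disjunctions can be sketched position by position. The function names are assumptions; the reduction falls out of collecting each position's characters into a set, so that identical alternatives collapse.

```python
def conjunction_of_disjunctions(variants):
    """One set of alternative characters per position of the string.
    Assumes all variants have the established string length."""
    if len({len(v) for v in variants}) != 1:
        raise ValueError("variants must all have the same length")
    return [frozenset(chars) for chars in zip(*variants)]


def render(positions):
    """Write the reduced expression in the notation used above."""
    parts = []
    for chars in positions:
        quoted = ["'%s'" % c for c in sorted(chars)]
        if len(quoted) == 1:
            parts.append(quoted[0])
        else:
            parts.append("(" + " v ".join(quoted) + ")")
    return " & ".join(parts)


positions = conjunction_of_disjunctions(["time", "Time"])
expression = render(positions)
```

For the variants 'time' and 'Time' the rendered expression is the reduced form given above.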

Each disjunction becomes a label on an arc to a new move
right state and the negation becomes the label on an arc
back to the original move right state.

Finally, the end context is added in the same manner
as the begin context. The first character becomes the label
on the last move right state created from the string and new
states are added for each character in the end context. The
result of these operations is displayed in Figure 35.
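
The construction can be illustrated with a small simulation. This is a sketch under assumptions, not the procedure of Figure 35: each move right state is stored as a tuple (label, next state, fail state), the label is a set of acceptable characters, the negated arc always returns to the first move right state, and the names build_procedure and run are hypothetical.

```python
import string


def build_procedure(labels):
    """One move right state per label. The labelled arc goes to the next
    state; the negated arc goes back to the first move right state (0)."""
    return [(label, index + 1, 0) for index, label in enumerate(labels)]


def run(procedure, text):
    """Scan text left to right; return the index of the character that
    completes the end context, or None if no match is found."""
    state = 0
    for position, char in enumerate(text):
        label, next_state, fail_state = procedure[state]
        state = next_state if char in label else fail_state
        if state == len(procedure):
            return position
    return None


# Begin context, reduced string, and end context of Hypothesis 6.
labels = [
    {" "},
    {"T", "t"}, {"i"}, {"m"}, {"e"},
    {" "} | set(string.punctuation),
]
procedure = build_procedure(labels)
hit = run(procedure, "Is it time?")
miss = run(procedure, "a timely result")
```

Because a failed arc consumes the character that caused the failure, a literal reading of the construction can overlook a match that begins on that character (for example, after two consecutive spaces); the sketch keeps that behavior rather than correcting it.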

IV. CONCLUSIONS AND RECOMMENDATIONS

A. SYNTHESIZER

The synthesizer that has been implemented for this
thesis will produce programs from example computations in a
reasonable amount of time. The system response for most of
the traces was within 10 seconds on a Digital Equipment
Corporation PDP-11/f0 minicomputer. The response time is a
function of the length of the trace and the number of
multiple occurrences of a particular instruction or set of
instructions in the final algorithm, with multiple
occurrences of an instruction affecting response time the
most. As Biermann [17] has noted, this has a nice
implication for programming by example because most
algorithms do not exhibit the characteristic of having a
large number of instances of the same instruction. In other
words, almost all multiple occurrences of an instruction in
an input trace are indicative of a loop in the algorithm.

In all of the test cases except those that required a
large amount of backups, static processing accounted for at
least half of the total response time. Future modifications
to the synthesizer which would decrease the total response
time could be directed toward designing the static
processing stage more efficiently. However, the trade-off
between static processing and dynamic processing must be
kept in perspective. Static processing is a linear function
of the length of the trace, whereas dynamic processing,
since it is an enumerative search technique, is an
exponential function of the length of the trace.

Another area which should be considered is the dynamic
processing stage. There exists a plethora of research
questions within this area. The primary one is: Can more
information be gleaned from the input trace during static
processing which will decrease the search time for dynamic
processing? Difference sets and couple-classes provide some
powerful mechanisms for decreasing the amount of search;
however, lower bounds computations on the number of states
required by the machine often increase the amount of search.
Lower bounds are restrictive in nature. They are designed to
force the final algorithm into a minimum state configuration
which, in many cases, causes extra search time. Relaxation
of the lower bounds computation will result in a final
algorithm which may not be expressed in a minimum number of
states, but which will still be deterministic. There might
be better methods of initially computing the number of
states which would result in a closer estimate of the actual
number of states required for the algorithm. Obviously, the
closer the initial guess is to the actual requirement, the
less backup incurred, and, therefore, the less search time
required.


Since the amount of search required is governed by the
failure memory entries, the more dense the failure memory
can be made, the more directed the search becomes. So
another area for research is to determine if more
information exists in the failure memory entries than is
currently being used. How much information do the structure
factor and the free state factor provide? Is there another
factor which would be useful?

Finally, a more general question can be addressed. The
underlying structure of this technique is an enumerative
search. Can the technique be generalized to include other
algorithms which are enumerative in nature? What
modifications to the failure memory are needed? How would
difference sets and couple-classes be redefined?

B. CONDITION PROCESSING

The condition processor front-end to the synthesizer
relieves the user from worrying about some of the control
structure considerations by automatically generating
conditions. Another addition which would increase the power
of the synthesizer is an automatic loop variable generator
as discussed by Biermann [10]. Although the text editing
environment has been used in this thesis work, the part of
the condition processor design which deals with a context
free environment is general enough that it could be designed
to operate in any domain.

Condition generation in a context sensitive environment
is a much harder problem further complicated by requisite
pattern matching and pattern generation. Before this type of
condition generation can be generalized, much work has to be
done to increase the efficiency of pattern generation
schemes. Angluin [19] has shown a pattern generation scheme
which is a polynomial time algorithm for pattern generation
with one variable, but the domain we have examined will
require at least two variables. There is not a polynomial
time algorithm for pattern generation with two variables.
Heuristic techniques will probably be necessary to provide
methods of pattern generation which will be fast enough to
be useful over a wide range of problems.

/* If free states are available, use one of them and continue

/* AddState increases the bound on the number of states allowed
for a particular instruction when there are free states
available to the machine and the instruction at the current
level needs an extra state in order for the machine to remain


/*  ATOI  is  used  to  provide  a  debug  level  for  the  program  */ 